We have the following log file which we need to import into Splunk:
"cdrRecordType","globalCallID_callManagerId","globalCallID_callId","nodeId","directoryNum","callIdentifier","dateTimeStamp","numberPacketsSent","numberOctetsSent","numberPacketsReceived","numberOctetsReceived","numberPacketsLost","jitter","latency","pkid","directoryNumPartition","globalCallId_ClusterID","deviceName","varVQMetrics"
INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,UNIQUEIDENTIFIER,VARCHAR(50),VARCHAR(50),VARCHAR(129),VARCHAR(600)
2,15,2768615,15,"10063114030",259142886,1471391005,827,121400,565,87061,0,0,0,"1014e40e-i061-2ii6-6cbb-q3e610140ec0","PART_FAKE_LINE1","FBSNEUC01","CIPCqcwecoe","MLQK=0.0000;MLQKav=0.0000;MLQKmn=0.0000;MLQKmx=0.0000;MLQKvr=null;CCR=0.0000;ICR=0.0000;ICRmx=0.0000;CS=0;SCS=0"
I am ignoring the header using the following config:
props.conf
[collab_cm_cmr_data]
pulldown_type = 1
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ,
TRANSFORMS-header_nullq = header_nullq
FIELD_QUOTE = "
NO_BINARY_CHECK = true
category = Cisco CMS Ver. 1
description = A comma-delimited output of the CM CMR file.
transforms.conf
[header_nullq]
DEST_KEY = queue
REGEX = ^TimeStamp
FORMAT = nullqueue
Similarly, I want to ignore the second line, so I have added the following configuration, but it's not working:
TRANSFORMS-null = discard_row
[discard_row]
DEST_KEY = queue
REGEX = ^INTEGER
FORMAT = nullqueue
So basically I want to ignore both the 1st and 2nd rows. Can someone guide me on what is wrong with the above config?
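To make the target concrete, the first three lines of the file break down like this (annotations added; sample values elided from the log above):

```text
# line 1: header -- the field names to use for CSV extraction
"cdrRecordType","globalCallID_callManagerId",...
# line 2: "typer" row -- data types only, should be dropped
INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),...
# line 3 onward: actual events to index
2,15,2768615,15,"10063114030",...
```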
The question is not really clear. It looks like you want to use the first line as the CSV field names and ignore the second line.
If that is the case, then you should be able to ignore the second line with props.conf alone (no entry in transforms.conf required):
props.conf
HEADER_FIELD_LINE_NUMBER = 2
This is assuming that "INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,UNIQUEIDENTIFIER,VARCHAR(50),VARCHAR(50),VARCHAR(129),VARCHAR(600)" is line number 2.
I took your 3 lines, made multiple copies of line 3 to grow the file, and then tried these configs (no transforms.conf):
inputs.conf
[monitor://C:\temp\Splunk\test\ignoreLine2\test.txt]
disabled = 0
index = test
sourcetype = testtype

props.conf
[testtype]
pulldown_type=1
SHOULD_LINEMERGE=false
INDEXED_EXTRACTIONS=CSV
HEADER_FIELD_LINE_NUMBER=2
FIELD_DELIMITER=,
FIELD_QUOTE="
NO_BINARY_CHECK=true
The first time I tried without HEADER_FIELD_LINE_NUMBER=2, and I did get line 2 in the test index.
The second time, I added HEADER_FIELD_LINE_NUMBER=2 and replaced INTEGER with INTEGER2 and 2,15, with 3,16, so that the input file was changed enough to reindex. After a Splunk restart I did not get INTEGER2 in the index, but did get the events with 3,16.
Perhaps you are confusing Splunk with your transforms method of removing line 2.
Hi, I just tested your suggestion, but with your configuration the fields are not present in Splunk. Instead, you will find fields like INTEGER, VARCHAR, and so on. So for me it is not a working solution.
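For what it's worth, that outcome is consistent with what HEADER_FIELD_LINE_NUMBER does: it tells Splunk which line *is* the header, so pointing it at line 2 makes the typer row supply the field names (a sketch of the effect, not a fix):

```ini
# props.conf (sketch)
HEADER_FIELD_LINE_NUMBER = 2
# => the header is read from line 2, so the extracted fields
#    are named INTEGER, VARCHAR(50), UNIQUEIDENTIFIER, ...
#    and the real field names on line 1 are no longer used
```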
Has somebody found another way?
Sorry if I caused confusion when posting my question. Yes, I want to use the first line as the CSV field names and ignore the second line.
I tried the following 2 options separately as well, but with no success:
HEADER_FIELD_LINE_NUMBER = 2
PREAMBLE_REGEX = ^INTEGER
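Each option was added to the original sourcetype stanza, i.e. roughly (a sketch of the attempts, one option uncommented at a time):

```ini
# props.conf (sketch of the attempts above)
[collab_cm_cmr_data]
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ,
FIELD_QUOTE = "
# attempt 1: treat line 2 as the header line
#HEADER_FIELD_LINE_NUMBER = 2
# attempt 2: skip lines matching this pattern
#PREAMBLE_REGEX = ^INTEGER
```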
It ONLY affects data that comes in AFTER the Splunk restart following your update. Once data hits the indexer, it is immutable and will stay in that format forever. Also, I don't see that ^TimeStamp will ever match anything (not in your example data, anyway).
The problem is that there is a header on the first line (indicating field names) and a "typer" on the second line (describing data types). It is the second, "typer" line that needs to be ignored.
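Given that shape, here is a minimal sketch of the nullqueue approach, assuming the events still pass through the pipeline where index-time TRANSFORMS run. Note that when INDEXED_EXTRACTIONS is set on a universal forwarder, the data arrives at the indexer already parsed, so transforms defined on the indexer may never see it; that could explain why the discard_row transform appeared to do nothing. The stanza name discard_typer is illustrative:

```ini
# props.conf -- sketch, sourcetype name from the question
[collab_cm_cmr_data]
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ,
FIELD_QUOTE = "
SHOULD_LINEMERGE = false
TRANSFORMS-discard_typer = discard_typer

# transforms.conf -- route the typer line to the nullqueue
[discard_typer]
REGEX = ^INTEGER,INTEGER
DEST_KEY = queue
FORMAT = nullqueue
```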