We have the following log file, which we need to import into Splunk:
"cdrRecordType","globalCallID_callManagerId","globalCallID_callId","nodeId","directoryNum","callIdentifier","dateTimeStamp","numberPacketsSent","numberOctetsSent","numberPacketsReceived","numberOctetsReceived","numberPacketsLost","jitter","latency","pkid","directoryNumPartition","globalCallId_ClusterID","deviceName","varVQMetrics"
INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,UNIQUEIDENTIFIER,VARCHAR(50),VARCHAR(50),VARCHAR(129),VARCHAR(600)
2,15,2768615,15,"10063114030",259142886,1471391005,827,121400,565,87061,0,0,0,"1014e40e-i061-2ii6-6cbb-q3e610140ec0","PART_FAKE_LINE1","FBSNEUC01","CIPCqcwecoe","MLQK=0.0000;MLQKav=0.0000;MLQKmn=0.0000;MLQKmx=0.0000;MLQKvr=null;CCR=0.0000;ICR=0.0000;ICRmx=0.0000;CS=0;SCS=0"
I am ignoring the header using the following config:
props.conf
[collab_cm_cmr_data]
pulldown_type = 1
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ,
TRANSFORMS-header_nullq = header_nullq
FIELD_QUOTE = "
NO_BINARY_CHECK = true
category = Cisco CMS Ver. 1
description = A comma-delimited output of the CM CMR file.
transforms.conf
[header_nullq]
DEST_KEY = queue
REGEX = ^TimeStamp
FORMAT = nullqueue
Similarly, I want to ignore the second line, so I have added the following configuration, but it's not working:
props.conf
TRANSFORMS-null = discard_row
transforms.conf
[discard_row]
DEST_KEY = queue
REGEX = ^INTEGER
FORMAT = nullqueue
So basically I want to ignore both the 1st and 2nd rows. Can someone guide me on what is wrong with the above config?
I found a similar problem on another thread; I am keeping an eye on that thread as well.
The question is not really clear. It looks like you want to use the first line as the CSV field-name input and ignore the second line.
If that is the case, then you should be able to ignore the second line with props.conf (no entry in transforms required):
props.conf
HEADER_FIELD_LINE_NUMBER = 2
This is assuming that "INTEGER,INTEGER,INTEGER,INTEGER,VARCHAR(50),INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,INTEGER,UNIQUEIDENTIFIER,VARCHAR(50),VARCHAR(50),VARCHAR(129),VARCHAR(600)" is line number 2.
I took your 3 lines, made multiple copies of line 3 to grow the file, and then tried these configs (no transforms.conf):
inputs.conf
[monitor://C:\temp\Splunk\test\ignoreLine2\test.txt]
disabled = 0
index = test
sourcetype = testtype
props.conf
[testtype]
pulldown_type=1
SHOULD_LINEMERGE=false
INDEXED_EXTRACTIONS=CSV
HEADER_FIELD_LINE_NUMBER=2
FIELD_DELIMITER=,
FIELD_QUOTE="
NO_BINARY_CHECK=true
The first time, I tried without HEADER_FIELD_LINE_NUMBER=2, and I did get line 2 in the test index.
The second time, I added HEADER_FIELD_LINE_NUMBER=2 and replaced INTEGER with INTEGER2 and 2,15, with 3,16, so that the input file was changed enough to be reindexed. After a Splunk restart, I did not get INTEGER2 in the index, but I did get the events with 3,16.
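A quick way to double-check, assuming the test index and sourcetype above (this is only a sanity-check search, not part of the fix):

index=test sourcetype=testtype INTEGER2

should return zero events if the typer line was dropped, while a plain

index=test sourcetype=testtype

should show the data rows.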
Perhaps your transforms method of removing line 2 is confusing Splunk.
Hi, I just tested your suggestion, but with your configuration the fields are not present in Splunk. Instead, you will find fields like INTEGER, VARCHAR, and so on, so for me it is not a working solution.
Has somebody found another way?
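One possible explanation, assuming HEADER_FIELD_LINE_NUMBER names the line whose values become the field names: setting it to 2 makes Splunk read the typer line as the header, which would produce exactly those INTEGER/VARCHAR field names. A minimal sketch that keeps line 1 as the header (the typer line then still has to be discarded separately):

[collab_cm_cmr_data]
INDEXED_EXTRACTIONS = CSV
# keep line 1 as the header so the real field names are used
HEADER_FIELD_LINE_NUMBER = 1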
Sorry if I caused confusion in putting my question. Yes, I want to use the first line as the CSV field-name input and ignore the second line.
I tried the following two options separately as well, but with no success:
HEADER_FIELD_LINE_NUMBER = 2
PREAMBLE_REGEX = ^INTEGER
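For reference, an untested sketch of combining the two; whether PREAMBLE_REGEX is applied to lines that come after the header line is the open question here, so treat this as a guess rather than a confirmed fix:

[collab_cm_cmr_data]
INDEXED_EXTRACTIONS = CSV
# line 1 holds the real field names
HEADER_FIELD_LINE_NUMBER = 1
# hoped-for effect: skip the typer line that starts with INTEGER
PREAMBLE_REGEX = ^INTEGER,INTEGER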
You spelled nullQueue wrong (casing matters). Fix that and restart, and BINGO!
Hi, I tried this, but no success:
[header_nullq]
DEST_KEY = queue
REGEX = ^TimeStamp
FORMAT = nullQueue
[discard_row]
DEST_KEY = queue
REGEX = ^INTEGER
FORMAT = nullQueue
It ONLY affects data that comes in AFTER the Splunk restart following your update. Once data hits the indexer, it is immutable and will stay in that format forever. Also, I don't see that ^TimeStamp will ever match anything (not in your example data, anyway).
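For what it's worth, the header line in your sample starts with a quoted cdrRecordType, so something along these lines (untested against your live data) would be closer:

[header_nullq]
DEST_KEY = queue
# match the actual first header field, including the leading double quote
REGEX = ^"cdrRecordType"
FORMAT = nullQueue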
Yes, after the changes I restarted Splunk and modified the input file so that it would be consumed by Splunk again, but that didn't help.
Thanks for pointing out ^TimeStamp.
IGNORE: Also, add HEADER_MODE = firstline to treat the first line as the header (it will not get ingested).
The problem is that there is a header on the first line (indicating field names) and a "typer" on the second line (describing data types). It is this second "typer" line that needs to be ignored.
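For clarity, the layout of the file is:

line 1: "cdrRecordType","globalCallID_callManagerId",...   <- header (real field names; keep for extraction)
line 2: INTEGER,INTEGER,INTEGER,...                         <- typer line (data types; must be discarded)
line 3: 2,15,2768615,...                                    <- first real event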
Oh, I misread. I take that back.