Getting Data In

Log file is not getting indexed

Engager

We have a custom application log file that looks like the sample below. The file is not getting indexed when its first 4 lines are present.
These logs are generated by a number of similar programs, and the entries shown as "XXXXX..." vary accordingly.

=============================================================================================================================================================================================

Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)

Application ---------------------------------------------------------------

Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)

Application ---------------------------------------------------------------

Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal startdate="09-APR-2013 17:01:02" enddate="N/A" requestid=37541696 currtime="09-APR-2013 17:15:13"

Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start
date="09-APR-2013 17:01:03" enddate="N/A" requestid=37541697 currtime="09-APR-2013 17:15:13"

==============================================================================================================================================================================================

The same file (below), with those lines removed, is indexed fine.

==============================================================================================================================================================================================

Application prg="XXXXX XXXXXX - Receipt Creation Program" phase=Running status=Normal startdate="09-APR-2013 17:01:02" enddate="N/A" requestid=37541696 currtime="09-APR-2013 17:15:13"

Application prg="XXXXX XXXXXX- Receipt Creation Program" phase=Running status=Normal start
date="09-APR-2013 17:01:03" enddate="N/A" requestid=37541697 currtime="09-APR-2013 17:15:13"

===============================================================================================================================================================================================

Is it possible to ignore or omit those 4 lines so that the file gets indexed? It is not going to be possible to remove these lines from the application side.
Thanks


Contributor

If I read your question correctly, your file is not being indexed at all.
Am I right in assuming that the start of the file is identical to that of other files for the first 256 bytes?

You might see some lines in splunkd.log that look like this: "File will not be read, is too small to match seekptr checksum". I use the following search to find them:

index=_internal source=*splunkd.log "File will not be read, is too small to match seekptr checksum" component="TailingProcessor" | dedup host file | table _time host file | sort host

Have a look at initCrcLength in inputs.conf; this option was introduced in Splunk 5.0.1.
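As a sketch, raising initCrcLength above the length of the identical header would make the initial checksum cover some of the distinguishing event lines as well (the monitor path and the value 1024 here are assumptions; adjust them for your environment):

In inputs.conf:

[monitor:///var/log/myapp/receipts.log]
initCrcLength = 1024

With the default of 256 bytes, two files that share the same boilerplate header can produce identical CRCs, and Splunk may treat one as already seen.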


Ultra Champion

Oh, reading your answer, I think you understood better what the problem may be.


Ultra Champion

You should probably read the following for guidance on how to skip indexing of some events.

http://docs.splunk.com/Documentation/Splunk/5.0.2/Deploy/Routeandfilterdatad#Keep_specific_events_an...

In props.conf:

[your_sourcetype]
TRANSFORMS-set= setnull,setparsing

In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = some_string
DEST_KEY = queue
FORMAT = indexQueue

You'll have to replace 'some_string' with something that distinguishes the lines you want to keep; in your example, "phase" or "currtime" occur only in the events you want to keep.

When these transforms are called from props.conf, the order is important: first ALL events are routed to the nullQueue (setnull), then the second transform (setparsing) re-routes the matching events from the nullQueue back to the indexQueue.
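The ordering logic above can be sketched outside Splunk. This is only an illustration of the "drop everything, then rescue matches" pattern, not Splunk's actual pipeline; the sample lines and the token "phase" are assumptions based on the log excerpt in the question:

```python
import re

# Hypothetical sample lines modeled on the question: header lines the
# poster wants to drop, and a key=value event line they want to keep.
lines = [
    "=====================================",
    "Application XXXXX XXXXX - YYYY BC and XX",
    "Application ---------------------------",
    'Application prg="Receipt Creation Program" phase=Running status=Normal',
]

# setnull: REGEX = .  -> any line with at least one character matches.
setnull = re.compile(r".")
# setparsing: REGEX = phase -> "phase" is assumed to occur only in events.
setparsing = re.compile(r"phase")

def route(line):
    """Mimic the transform ordering: the later transform wins."""
    queue = "indexQueue"        # default destination
    if setnull.search(line):
        queue = "nullQueue"     # first transform: throw everything away
    if setparsing.search(line):
        queue = "indexQueue"    # second transform: rescue real events
    return queue

kept = [line for line in lines if route(line) == "indexQueue"]
```

Only the line containing `phase` ends up in `kept`; the header lines are routed to the nullQueue and never indexed.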

Hope this helps,

Kristian
