Getting Data In

Log file is not getting indexed

yogonline
Engager

We have a custom application
log file which looks something like below, this file is not getting indexed with the 1st 4 lines in it.
These logs are generated for a number of similar programs the entries in "XXXXs..." will vary based on this

=============================================================================================================================================================================================

Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)

Application ---------------------------------------------------------------

Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)

Application ---------------------------------------------------------------

Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:02" end_date="N/A" requestid=37541696 curr_time="09-APR-2013 17:15:13"

Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:03" end_date="N/A" requestid=37541697 curr_time="09-APR-2013 17:15:13"

==============================================================================================================================================================================================

Same file (below) with the discarded lines works fine.

==============================================================================================================================================================================================

Application prg="XXXXX XXXXXX - Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:02" end_date="N/A" requestid=37541696 curr_time="09-APR-2013 17:15:13"

Application prg="XXXXX XXXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:03" end_date="N/A" requestid=37541697 curr_time="09-APR-2013 17:15:13"

===============================================================================================================================================================================================

Is it possible to ignore or omit the 4 lines for the file to be indexed since it is not going to be possible to remove the entries from the applciation side to remove these lines.
Thanks

Tags (1)
0 Karma

las
Contributor

If I read your question correct, your file is not indexed.
Am I right in assuming, that the start of the file is identical with other files for the first 256 bytes?

You might see some lines in splunkd.log that looks like this: "File will not be read, is too small to match seekptr checksum" I use the following search:

index=_internal source=*splunkd.log "File will not be read, is too small to match seekptr checksum" component="TailingProcessor" | dedup host file | table _time host file | sort host

Try to look at initCrcLength in inputs.conf, this option came in 5.0.1

0 Karma

kristian_kolb
Ultra Champion

Oh..reading your answer, I think you understood better what the problem may be.

0 Karma

kristian_kolb
Ultra Champion

You should probably read the following for guidance on how to skip indexing of some events.

http://docs.splunk.com/Documentation/Splunk/5.0.2/Deploy/Routeandfilterdatad#Keep_specific_events_an...

In props.conf:

[your_sourcetype]
TRANSFORMS-set= setnull,setparsing

In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = some_string
DEST_KEY = queue
FORMAT = indexQueue

You'll have to replace 'some_sting' with a something that distinguishes the lines you want to keep, e.g. in your example "phase" or "curr_date" occur in the events you want to keep.

When these transforms are called from props.conf, the order is important; first ALL events are set to be thrown away (setnull), then followed by the second transform (setparsing) that re-set the destination for the matching events from the nullQueue back to the indexQueue.

Hope this helps,

Kristian

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Agent Mode Engaged! Enchaining Agentic Operations with Splunk AI Assistant 2.0

    Are you ready to transform how your team handles complex data requests? We invite you to our upcoming ...

Announcing Modern Navigation: A New Era of Splunk User Experience

We are excited to introduce the Modern Navigation feature in the Splunk Platform, available to both cloud and ...

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

We are excited to announce that the upcoming releases of Splunk Enterprise 10.2.x and Splunk Cloud Platform ...