Why does Splunk (re-)index this rolled file? How to troubleshoot?

Influencer

Inputs stanza from btool:

[monitor:///apps/Logs/*/www/Reporting/CRTLog.log*]
_rcvbuf = 1572864
disabled = 0
host = apphost1
index = reporting_main
sourcetype = reporting_crtlog

The log rotation they use keeps 10 rolled copies, suffixed .1 through .10. E.g., when the original rolls, it is renamed CRTLog.log.1 and a new CRTLog.log file is created. Standard stuff.

I have confirmed, without a doubt, the rolled files maintain consistent content. I wrote a script to grab checksums of the first 1KB of each file every few seconds. They always check out -- .1's checksum matches what the original showed before rolling.

However, Splunk is sometimes (not all the time) treating the 1st rolled file as a new file:

 WatchedFile - Will begin reading at offset=0 for file='/apps/Logs/apphost1/www/Reporting/CRTLog.log.1'

Probably 30% of the time it re-reads the rolled file. Only .1, never any of the others.
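
One way to put a number on that rate is to tally the offset=0 messages per file in splunkd.log. This is only a rough sketch: count_rereads is a hypothetical helper name, and the grep pattern assumes the message text is exactly as in the line quoted above.

```shell
#!/bin/sh
# Tally "Will begin reading at offset=0" messages per monitored file.
# Assumption: the first argument is the path to a splunkd.log whose
# WatchedFile messages look like the one quoted in this thread.
count_rereads() {
    log_file="$1"
    # Keep only WatchedFile lines that report a read from offset 0,
    # extract the quoted file name, then count occurrences per file.
    grep "WatchedFile - Will begin reading at offset=0 for file=" "$log_file" \
        | sed "s/.*for file='\([^']*\)'.*/\1/" \
        | sort | uniq -c | sort -rn
}
```

On a default install this would typically be called as count_rereads "$SPLUNK_HOME/var/log/splunk/splunkd.log". Comparing the count for CRTLog.log.1 against the number of rolls in the same period gives the re-read rate.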

Any tips to further troubleshoot this?

(Ticket's open, but after 3 days I kinda need an answer.)

EDIT: Sample checksum comparison:

I use the following to grab a list:

for f in *; do printf '%s: ' "$f"; head -n 50 "$f" | md5sum; done

CRTLog.log: 0fb375c11ad382eec3cc482fb1332c81  -
CRTLog.log.1: 40f3878392f5ca816bfc4948b263d0e2  -
CRTLog.log.10: ffc1a6dec71a64f69a2f4c42b53d68cb  -
CRTLog.log.2: a3b7d786d8aa7260cc5e46635e764c8f  -
<snip>

Then wait for a roll to fire and grab the new list:

CRTLog.log: ad978fdb89b04169e95ba96c15887042  -
CRTLog.log.1: 0fb375c11ad382eec3cc482fb1332c81  -
CRTLog.log.10: 82d1b645c89e4e34b4e0a89712d30f3e  -
CRTLog.log.2: 40f3878392f5ca816bfc4948b263d0e2  -
CRTLog.log.3: a3b7d786d8aa7260cc5e46635e764c8f  -
<snip>

So the first 50 lines (about 16 KB of data) match before and after the roll to .1. Splunk still re-read the file in this case.
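
Note that this check hashes lines, whereas Splunk's initial CRC covers a fixed number of bytes from the start of the file (256 by default, set by initCrcLength in inputs.conf). A sketch that hashes a fixed byte count instead, so the comparison is closer to what Splunk looks at; head_md5 is a hypothetical helper name, and head -c and md5sum are assumed to be available:

```shell
#!/bin/sh
# Print an MD5 of the first N bytes of each file given.
# Assumption: pass 256 to mirror Splunk's default initCrcLength, or
# whatever value the monitor stanza sets if it overrides the default.
head_md5() {
    nbytes="$1"
    shift
    for f in "$@"; do
        # head -c reads a fixed byte count, unlike head -n which reads lines.
        sum=$(head -c "$nbytes" "$f" | md5sum | cut -d' ' -f1)
        printf '%s: %s\n' "$f" "$sum"
    done
}
```

For example, head_md5 256 CRTLog.log* run before and after a roll should show .1 inheriting the checksum the original had. This only approximates the first part of Splunk's file identification and does not reproduce its actual CRC.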

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Esteemed Legend

I don't know why, but you should just blacklist the *log.1 file and be done with it.

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Influencer

I have a sneaking feeling I would just see .2 show up as a dup. So the next step would be to drop the * and just log the original... but then we get missed logs. (Busy log file)

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

SplunkTrust

Why even put an asterisk after .log in the monitor line? As long as the events were indexed while the file was still CRTLog.log, there is no need to look at the rolled copies ever again:

[monitor:///apps/Logs/*/www/Reporting/CRTLog.log]

If this is pushed out to a new host via the deployment server, I can see why you would want the old files indexed, but that is the only case I can see for adding the * on the end of the line.

One more case for not having the asterisk is that it requires less CPU and memory to look at just one file vs. 11 files.

Just tryin' to keep it simple. 🙂

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Influencer

It's a busy log file. It often rolls before Splunk has finished reading the last X entries. Including the rolled files in the monitor entry is best practice -- if not officially from Splunk, definitely in my experience. Usually it works fine, I'm just at a loss to explain why it's failing in this case.

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Communicator

It looks like a pretty standard inputs.conf stanza. How about the CRTLog.log* in your monitor line, [monitor:///apps/Logs/*/www/Reporting/CRTLog.log*]? Have you tried it without the * at the end, i.e. just [monitor:///apps/Logs/*/www/Reporting/CRTLog.log]? Otherwise, I like the blacklist idea from woodcock, or maybe have the log-roll name changed?

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Influencer

Identifying files regardless of name, in order to handle rolled files, is a core feature of Splunk. And in this case, it's required for us. Without the asterisk we very noticeably miss log entries. Currently our choice is to miss log entries or have double entries. Not optimal! 🙂

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Communicator

I didn't realize my question was already asked ... sorry about that.

A recent issue I had getting data in: I had to remove my * and pull in the whole directory. My [monitor:///Logs/isam/reports/access.log*] became [monitor:///Logs/isam/reports/access.log/], and that worked for me. It had to monitor the whole directory instead of using the wildcard on the log name. I also kept running into a problem with the whitelist parameter, so I dropped that. I used $SPLUNK_HOME/bin/splunk list monitor (run on my UF) to show which files and directories are being monitored. This highlighted a regex issue in another stanza, where I was escaping a character incorrectly. Good luck.

Re: Why does Splunk (re-)index this rolled file? How to troubleshoot?

Influencer

EDIT: Spoke too soon. Just got lucky with a string of good rolls. The 13th one failed. Same scenario. Sigh.

This looks like the fix (EDIT: nope).

Bad:

[monitor:///apps/Logs/*/www/Reporting/CRTLog.log*]

Good:

[monitor:///apps/Logs/*/www/Reporting/]
whitelist = CRTLog

That seems like a bug to me. Not sure what's triggering it because I use the "Bad" style above in literally a thousand different scenarios. This is the first that's bitten me.

Thanks!
