How to manage indexing rolling log files without duplicating data in the Index

ericrobinson — Tue, 15 Mar 2011 23:51:16 GMT

We are testing in a high-throughput environment where the logs grow to 251MB in about 4-6 minutes, at which point they are rolled to a dated log file.

e.g. test.log -> test.log.20110315042946

The problem is that Splunk thinks it has already indexed one or more of the rolled log files, so we are missing data from the performance run. I have read about crcSalt, but also that it should be avoided on rotating log files.

03-15-2011 09:38:04.028 ERROR TailingProcessor - Ignoring path due to: File will not be read, seekptr checksum did not match (file=/opt/perf/gett/log/test.log.20110315091120). Last time we saw this initcrc, filename was different. You may wish to use a CRC salt on this source. Consult the documentation or contact Splunk Support for more info.

Can someone suggest how this problem can be managed?

Re: How to manage indexing rolling log files without duplicating data in the Index

netwrkr — Wed, 16 Mar 2011 03:07:32 GMT

Could you name the log file with the associated date / time value at the beginning rather than changing it afterwards?

Re: How to manage indexing rolling log files without duplicating data in the Index

gkanapathy — Wed, 16 Mar 2011 04:33:05 GMT

Are the files simply renamed when they are rolled? What is the inputs.conf stanza that you are using to monitor the files?

Re: How to manage indexing rolling log files without duplicating data in the Index

ericrobinson — Thu, 17 Mar 2011 00:12:07 GMT

Hi all, thanks for the help. We found that the rolled log file was also being renamed by another log archiving process.

What was happening was that the log would first be rolled to test.log.1

Then the archiving process would rename it to test.log.20110316

We think that Splunk was seeing the log under the .1 name, and when the file name changed to .2011*, the CRC check failed because the same initial CRC had already been seen under a different filename.

After changing our inputs.conf, we are no longer seeing the issue.

We were monitoring test.log* and now monitor only test.log and test.log.2011*
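
For anyone hitting the same error, here is a rough sketch of what the changed inputs.conf could look like. The actual stanzas were not posted in this thread, so this is an assumption: the directory is taken from the error message in the first post, and any other settings (index, sourcetype, and so on) are left out.

```
# Hypothetical inputs.conf, reconstructed from the description above.
# The directory comes from the TailingProcessor error in the first post.

# Monitor the live log file.
[monitor:///opt/perf/gett/log/test.log]

# Monitor only the final dated names, so the intermediate
# test.log.1 name is never picked up before the archiver renames it.
[monitor:///opt/perf/gett/log/test.log.2011*]
```

Note that the 2011* pattern is tied to the year in the dated suffix, so it would need to be adjusted when the suffix no longer starts with 2011.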