Why are my files being re-indexed?

I've noticed tons of duplicate events, and the following message in splunkd.log correlates with the time I started seeing the dupes. It also started after I upgraded from v4.0.9 to v4.1.4:

"File too small to check seekcrc, probably truncated. Will re-read entire file=....."

Does anyone know why this is occurring?

My settings in inputs.conf include:

crcSalt = <SOURCE>
followtail = 1

I've already checked for the following, and none of these apply:

Causes of reindexing:

File contents (especially the first 256 bytes) are modified in-place. This shouldn't happen for log files (they're supposed to be a record).

The CHECK_METHOD for the files was set to entire_md5 or modtime. This forces the files to be reindexed.

Some sourcetypes like 'text_file' intentionally set the CHECK_METHOD because it is desired to index the complete file each time.
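To illustrate the first cause in the list above, here is a minimal Python sketch of a checksum taken over only the first 256 bytes of a file. It is not Splunk's actual implementation: `zlib.crc32`, the `head_crc` helper, and the sample log lines are all assumptions made for illustration. It only shows why appending to a log leaves such a checksum unchanged, while an in-place edit near the start of the file does not.

```python
import zlib

HEAD_BYTES = 256  # size of the leading window mentioned in the list above


def head_crc(data: bytes) -> int:
    """Checksum of the first HEAD_BYTES bytes only (illustrative, not Splunk's algorithm)."""
    return zlib.crc32(data[:HEAD_BYTES])


# Hypothetical log content, longer than 256 bytes.
original = b"2010-08-01 12:00:00 INFO service started\n" * 10

# Normal log growth: new data is appended at the end.
appended = original + b"2010-08-01 12:05:00 INFO heartbeat\n"

# In-place modification: the very first byte is overwritten.
edited = b"X" + original[1:]

same_after_append = head_crc(original) == head_crc(appended)
same_after_edit = head_crc(original) == head_crc(edited)

print("unchanged after append:", same_after_append)
print("unchanged after in-place edit:", same_after_edit)
```

With a check like this, the appended file still looks like the same file, whereas the edited file looks like a new one and would be read again from the start.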

Re: Why are my files being re-indexed?

Splunk Employee

crcSalt = <SOURCE>
followtail = 1

crcSalt = <string>
* Use this to force Splunk to consume files with matching CRCs.
* Set any string to add to the CRC.
* If set to "crcSalt = <SOURCE>", then the full source path is added to the CRC.
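As a rough sketch of what "adding a string to the CRC" means, the Python below mixes a salt into a checksum of the start of a file. This is an assumption-laden illustration, not Splunk's real code: `zlib.crc32`, the `salted_head_crc` helper, the 256-byte window, and both file paths are hypothetical. It shows that two files with identical content produce the same unsalted checksum, but different checksums once each file's own source path is used as the salt.

```python
import zlib


def salted_head_crc(data: bytes, salt: str = "") -> int:
    """Checksum of the salt followed by the first 256 bytes (illustrative only)."""
    return zlib.crc32(salt.encode("utf-8") + data[:256])


# Two hypothetical files whose contents start identically,
# for example rotated logs that share the same header.
content = b"#Software: example server\n#Fields: date time status\n"
path_a = "/var/log/app1/server.log"  # hypothetical path
path_b = "/var/log/app2/server.log"  # hypothetical path

# Without a salt, the two files are indistinguishable by checksum.
unsalted_match = salted_head_crc(content) == salted_head_crc(content)

# With the source path as the salt, each file gets its own checksum.
salted_match = salted_head_crc(content, path_a) == salted_head_crc(content, path_b)

print("match without salt:", unsalted_match)
print("match with path as salt:", salted_match)
```

The flip side is that any change to the salt, such as a renamed or moved file, makes an already-indexed file look new.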

I'm assuming that after the upgrade Splunk is computing a different CRC for these files, and that this is causing the double indexing.