Why are my files being re-indexed?

Path Finder

I've noticed tons of duplicate events, and the following message in splunkd.log correlates with the time I started seeing the dupes. It also started after I upgraded from v4.0.9 to v4.1.4:

"File too small to check seekcrc, probably truncated. Will re-read entire file=....."

Does anyone know why this is occurring?

My settings in inputs.conf include:

crcSalt = <SOURCE>
followtail = 1

I've already checked for the following, and none of these apply:

Causes of reindexing:

File contents (especially the first 256 bytes) are modified in-place. This shouldn't happen for log files (they're supposed to be a record).

The CHECK_METHOD for the files was set to entire_md5 or modtime. This forces the files to be reindexed.

Some sourcetypes like 'text_file' intentionally set the CHECK_METHOD because it is desired to index the complete file each time.

Splunk Employee

crcSalt = <SOURCE>
followtail = 1

crcSalt = <string>
* Use this to force Splunk to consume files with matching CRCs.
* Set any string to add to the CRC.
* If set to "crcSalt = <SOURCE>", then the full source path is added to the CRC.

I'm assuming that after the upgrade Splunk is reading a different CRC, and this is causing the double indexing.
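To make the idea concrete, here is a rough Python sketch of how a salt changes a checksum taken over the start of a file. It is only an illustration: zlib.crc32, the salted_crc helper, the file paths, and the way the salt is appended are all assumptions for the example, not Splunk's actual implementation. The only detail taken from this thread is that the first 256 bytes matter.

```python
import zlib


def salted_crc(head: bytes, salt: str = "") -> int:
    """CRC32 over the first 256 bytes of a file, with an optional salt appended.

    Hypothetical helper: Splunk's real checksum routine may differ.
    """
    return zlib.crc32(head[:256] + salt.encode("utf-8"))


# Two different files that happen to start with the same 256 bytes,
# for example because they share a common header.
head = b"# log header\n" * 30

# Hypothetical source paths, standing in for what <SOURCE> would expand to.
path_a = "/var/log/app/a.log"
path_b = "/var/log/app/b.log"

unsalted_a = salted_crc(head)
unsalted_b = salted_crc(head)
salted_a = salted_crc(head, path_a)
salted_b = salted_crc(head, path_b)

# Without a salt the two files are indistinguishable by checksum.
print("unsalted match:", unsalted_a == unsalted_b)
# With the source path as salt, each file gets its own checksum.
print("salted match:  ", salted_a == salted_b)
```

By the same logic, if the checksum stored before the upgrade was computed one way and the new version computes it another way, no stored value would match and every file would look new, which is consistent with the re-indexing described above.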
