<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why are my files being re-indexed? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-my-files-being-re-indexed/m-p/48654#M9238</link>
    <description>&lt;P&gt;crcSalt = &amp;lt;SOURCE&amp;gt;&lt;BR /&gt;
followtail = 1  &lt;/P&gt;

&lt;P&gt;crcSalt = &amp;lt;string&amp;gt;&lt;BR /&gt;
* Use this to force Splunk to consume files with matching CRCs.&lt;BR /&gt;
* Set any string to add to the CRC.&lt;BR /&gt;
* If set to "crcSalt = &amp;lt;SOURCE&amp;gt;", then the full source path is added to the CRC.&lt;/P&gt;

&lt;P&gt;I'm assuming that after the upgrade Splunk is computing a different CRC for these files, and that this is causing the double indexing.&lt;/P&gt;</description>
    <pubDate>Tue, 14 Sep 2010 04:16:55 GMT</pubDate>
    <dc:creator>Genti</dc:creator>
    <dc:date>2010-09-14T04:16:55Z</dc:date>
    <item>
      <title>Why are my files being re-indexed?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-my-files-being-re-indexed/m-p/48653#M9237</link>
      <description>&lt;P&gt;I've noticed tons of duplicate events, and the following message in splunkd.log correlates with the time I started seeing the dupes. It also started after I upgraded from v4.0.9 to v4.1.4:&lt;/P&gt;

&lt;P&gt;"File too small to check seekcrc, probably truncated. Will re-read entire file=....."&lt;/P&gt;

&lt;P&gt;Does anyone know why this is occurring?&lt;/P&gt;

&lt;P&gt;My settings in inputs.conf include:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;crcSalt = &amp;lt;SOURCE&amp;gt;
followtail = 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I've already checked for the following, and none of these apply:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Causes of reindexing:&lt;/STRONG&gt; &lt;/P&gt;

&lt;P&gt;File contents (especially the first 256 bytes) are modified in-place. This shouldn't happen for log files (they're supposed to be a record). &lt;/P&gt;

&lt;P&gt;The CHECK_METHOD for the files was set to entire_md5 or modtime. This forces the files to be reindexed. &lt;/P&gt;

&lt;P&gt;Some sourcetypes like 'text_file' intentionally set the CHECK_METHOD because it is desired to index the complete file each time. &lt;/P&gt;</description>
      <pubDate>Sun, 12 Sep 2010 21:28:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-my-files-being-re-indexed/m-p/48653#M9237</guid>
      <dc:creator>SK110176</dc:creator>
      <dc:date>2010-09-12T21:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: Why are my files being re-indexed?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-are-my-files-being-re-indexed/m-p/48654#M9238</link>
      <description>&lt;P&gt;crcSalt = &amp;lt;SOURCE&amp;gt;&lt;BR /&gt;
followtail = 1  &lt;/P&gt;

&lt;P&gt;crcSalt = &amp;lt;string&amp;gt;&lt;BR /&gt;
* Use this to force Splunk to consume files with matching CRCs.&lt;BR /&gt;
* Set any string to add to the CRC.&lt;BR /&gt;
* If set to "crcSalt = &amp;lt;SOURCE&amp;gt;", then the full source path is added to the CRC.&lt;/P&gt;
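
&lt;P&gt;As a rough sketch only, a monitor stanza using this setting might look like the following. The monitored path /var/log/myapp is a hypothetical placeholder and does not come from the original post.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Hypothetical stanza: /var/log/myapp is a placeholder path, not taken from the question
[monitor:///var/log/myapp]
# Literal value, angle brackets included: adds the full source path to the CRC
crcSalt = &amp;lt;SOURCE&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;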

&lt;P&gt;I'm assuming that after the upgrade Splunk is computing a different CRC for these files, and that this is causing the double indexing.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Sep 2010 04:16:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-are-my-files-being-re-indexed/m-p/48654#M9238</guid>
      <dc:creator>Genti</dc:creator>
      <dc:date>2010-09-14T04:16:55Z</dc:date>
    </item>
  </channel>
</rss>

