Getting Data In

When a file is already being monitored and indexed by Splunk, and that file gets overwritten with updates instead of being appended to, is Splunk smart enough not to reindex the already-indexed data and to index only what's newly added?


In other words, if the file is overwritten or rewritten rather than appended to, does Splunk re-evaluate the entire file and work out whether to index the already-indexed data again?

Splunk Employee

I'm not sure of the exact scenario you have in mind:

  1. The file is deleted or truncated and new data is rewritten from the start; or
  2. The file is overwritten from the beginning with the same old contents up to the point where it previously ended, and then a few new lines are added.

In the first case, Splunk will have no problem detecting the new data. In the second case, unless the old data is rewritten faster than Splunk can detect that the file has been changed or deleted, Splunk will probably wind up double-indexing the old data. If the file is rewritten fast enough (or a new file is moved/renamed over the old one), there won't be any problems.
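
To make the two cases concrete, here is a rough Python sketch of how a generic file tailer could behave. It is only a toy model under my own assumptions, not Splunk's actual implementation: `TailState`, `poll`, and `HEAD_LEN` are made-up names, and the model assumes the reader remembers a checksum of the start of the file plus the offset it has read up to.

```python
import zlib

HEAD_LEN = 256  # hypothetical: number of leading bytes used to identify the file


class TailState:
    """Hypothetical checkpoint for one monitored file."""

    def __init__(self):
        self.offset = 0                    # how far the reader has consumed
        self.head_len = 0                  # how many leading bytes were checksummed
        self.head_crc = zlib.crc32(b"")    # checksum of those leading bytes


def poll(state, content):
    """Return the bytes a tailing reader would pick up for one observed snapshot."""
    # The file looks replaced if it shrank below the saved offset
    # or if its beginning no longer matches the saved checksum.
    replaced = (
        len(content) < state.offset
        or zlib.crc32(content[:state.head_len]) != state.head_crc
    )
    if replaced:
        state.offset = 0  # start over: everything in the file counts as new

    new_data = content[state.offset:]

    # Save the checkpoint for the next poll.
    state.offset = len(content)
    state.head_len = min(len(content), HEAD_LEN)
    state.head_crc = zlib.crc32(content[:state.head_len])
    return new_data
```

In this model, if the reader has already consumed `a\nb\n` and its next look at the file sees the complete rewrite `a\nb\nc\n`, `poll` returns only `c\n`. If it instead catches the file mid-rewrite (say, only `a\n` written so far), the file looks truncated, the offset resets, and the old lines are read a second time.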