We're seeing a correlation between the times of the "WatchedFile - Will begin reading at offset=..." log messages and the time frames covered by the duplicate indexing.
My running theory is that the WatchedFile component is starting at an incorrect file offset, resulting in re-indexing of all events between the time corresponding to the assigned offset and the time the WatchedFile component actually started reading. The ignoreOlderThan setting in inputs.conf may also be playing a part: if buckets are cleared on shutoff, the offset wouldn't matter and the entire file would be re-indexed daily.
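For anyone wanting to check the same correlation, a search along these lines might list the relevant messages in time order. It is only a sketch: it assumes the forwarders' splunkd.log is being sent to the _internal index (the default behaviour) and that the component field is extracted as usual.

```
index=_internal sourcetype=splunkd component=WatchedFile "Will begin reading at offset"
| table _time host _raw
| sort 0 _time
```

Comparing the _time of these messages per host against the index times of the duplicated events should show whether the two line up.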
I'm looking for possible reasons a single event would be indexed numerous times on our main indexers from our heavy forwarders. We have ignoreOlderThan set to 1d, but from the looks of it the file watcher is indexing events multiple times within a 10-20 minute period, around an hour after the event occurs, both before and after reboot. I've verified that the source file only contains one of each event, but our forwarders seem to be pushing multiple copies to the indexers, and these show up as duplicates in our counts. The copies land on different indexers, each with a different index time.
Example of issue:
_time value = 2019-12-08 11:31:17.116
index timestamps/servers indexed =
indexer 3 - 12/08/2019 04:27:51, 12/08/2019 05:36:19
indexer 8 - 12/08/2019 06:39:04, 12/09/2019 05:40:10, 12/09/2019 06:47:59
indexer 9 - 12/08/2019 07:45:06, 12/09/2019 07:34:19
indexer 10 - 12/08/2019 08:24:10, 12/09/2019 03:55:23
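A breakdown like the one above could be produced with a search roughly like the following. This is a sketch rather than the exact search used: the index and sourcetype values are the placeholders from the inputs.conf segment, and event_hash, index_time, indexers and index_times are just illustrative field names.

```
index=indexers sourcetype=sourcetype
| eval event_hash=md5(_raw)
| eval index_time=strftime(_indextime, "%m/%d/%Y %H:%M:%S")
| stats count values(splunk_server) AS indexers values(index_time) AS index_times BY event_hash
| where count > 1
```

It groups events with identical raw text and keeps only those indexed more than once, listing which indexers hold the copies and when each copy was indexed.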
inputs.conf segment -
index = indexers
sourcetype = sourcetype
ignoreOlderThan = 1d
outputs.conf segment -
# Turn off indexing on the local machine, we want the items indexed at the main indexers.
indexAndForward = false
# Define which group of indexers we are sending to. We currently only have one.
defaultGroup = primary_indexers
maxQueueSize = 250MB
server = servers
autoLB = true
forceTimebasedAutoLB = true