I've been all over related questions on Splunkbase, but I have not found out exactly why Splunk sometimes indexes duplicate events. A simple dedup helps mitigate the issue, but it does not get to the core of the problem.
I'm indexing multiple logs from a global file system, so my inputs.conf looks like this:
[monitor://global/file/system/apache/log/nodes*/access_log]
index = log_index
The number of duplicate events is not consistent; it's usually between 2 and 12.
Should I add the crcSalt option?
The other option I'm using is setting maxKBps = 56 on the forwarder. Will this have any impact on the main indexer?
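For reference, crcSalt changes how the monitor input fingerprints files: with `crcSalt = <SOURCE>`, the full source path is mixed into the CRC, so files whose first bytes are identical are still tracked separately. Be careful, though: when the same file is reachable through multiple paths (for example via symlinks), or when logs are rotated, salting with the path can cause the same content to be indexed again. A sketch (the path is the illustrative one from the question):

```ini
# inputs.conf (sketch, not a recommendation)
[monitor://global/file/system/apache/log/nodes*/access_log]
index = log_index
# Mixes the source path into the file CRC so files with identical
# beginnings are tracked as distinct. With symlinked or rotated logs
# this can RE-index content, i.e., create duplicates rather than fix them.
crcSalt = <SOURCE>
```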
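For what it's worth, maxKBps lives in the [thruput] stanza of limits.conf on the forwarder. It only throttles how fast the forwarder sends data; it does not deduplicate or duplicate anything on the indexer, so at most you'd see delayed (not duplicated) events. A sketch:

```ini
# limits.conf on the forwarder (sketch)
[thruput]
# Cap forwarder output at 56 KB/s. This throttles delivery speed only;
# it has no effect on whether events arrive once or twice.
maxKBps = 56
```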
Thanks for the response. I took a look at both the forwarder and indexer splunkd.log files and did not see any WARN lines concerning possible duplicate events. I'm thinking it might be the way our global file system is set up, since our logs reside on a global mount using symlinks.
I've seen duplicate logs caused by the following:
- Is the log being rotated? If so, monitor only the current log.
- Is there a link that duplicates the contents into another monitored directory? If so, remove the link or blacklist one of the paths.
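Both cases above can be handled in inputs.conf. A sketch using the stanza from the question (the blacklist regex is an illustrative assumption matched against the full source path; adjust it to your rotation naming scheme):

```ini
# inputs.conf (sketch; regex is illustrative)
[monitor://global/file/system/apache/log/nodes*/access_log]
index = log_index
# Skip rotated/compressed copies (access_log.1, access_log.gz, ...)
# so only the live log is read and rotated files are not re-indexed.
blacklist = \.(gz|bz2|\d+)$
```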