Splunk Search

Can anyone help clarify why Splunk sometimes indexes duplicate events from one log file?

ten_yard_fight
Path Finder

I've been through the related questions on Splunkbase, but I haven't found out exactly why Splunk sometimes indexes duplicate events. A simple dedup helps mitigate the issue but does not get to the core of the problem.

My scenario:
I'm indexing multiple logs from a global file system, so my inputs.conf looks like this:

[monitor://global/file/system/apache/log/nodes*/access_log]
index = log_index

The number of duplicate events is not consistent; it is usually between 2 and 12.
Should I add the crcSalt option?
The other option I'm using is maxKBps = 56 on the forwarder. Will this have any impact on the main indexer?

0 Karma

amit_saxena
Communicator

Hi,

Check "splunkd.log" for the following pattern to see whether the forwarder is resending a data block:

WARN TcpOutputProc - Possible duplication of events with channel=

Regards,
Amit Saxena
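
For anyone who wants to automate that check, here is a minimal Python sketch that scans a splunkd.log for the warning quoted above. The function names are made up for illustration, and the regex assumes only that the message text matches the quoted pattern, allowing for variable whitespace after WARN. It is one possible way to do the check, not an official Splunk tool.

```python
import re

# Message text taken from the post above. Whitespace after "WARN" is matched
# loosely because the exact spacing in splunkd.log is an assumption here.
DUP_WARNING = re.compile(
    r"WARN\s+TcpOutputProc - Possible duplication of events with channel="
)


def find_duplication_warnings(lines):
    """Return the lines that contain the possible-duplication warning."""
    return [line.rstrip("\n") for line in lines if DUP_WARNING.search(line)]


def scan_log(path):
    """Scan one log file (hypothetical helper) and return the matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_duplication_warnings(handle)
```

Point scan_log at the forwarder's splunkd.log, normally under $SPLUNK_HOME/var/log/splunk/. An empty result means the forwarder did not log a resend, so the duplicates probably come from somewhere else.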

0 Karma

lukejadamec
Super Champion

Links between monitored files or directories will cause duplicates. Remove the links or blacklist the duplicate paths.

0 Karma

ten_yard_fight
Path Finder

Hi Amit,

Thanks for the response. I looked at the splunkd.log files on both the forwarder and the indexer and did not see any WARN lines about possible duplicate events. I think it might be the way our global file system is set up, since our logs reside on a global mount that uses symlinks.

0 Karma

lukejadamec
Super Champion

I've seen duplicate logs caused by the following:
Is the log being rotated? If so, then monitor only the current log.
Is there a link that duplicates the contents to another monitored directory? If so, then remove the link, or blacklist one of them.
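
To check the link theory, a rough Python sketch like the one below can show whether several paths matched by the monitor wildcard resolve to the same physical file. The function name is hypothetical, and the approach (glob plus os.path.realpath) is just one way to spot aliases. It does not reproduce how Splunk itself tracks files.

```python
import glob
import os
from collections import defaultdict


def find_aliased_files(pattern):
    """Group paths matching a glob pattern by the real file they resolve to.

    Only groups with more than one path are returned, i.e. files that are
    reachable under several names through symbolic links.
    """
    by_real_path = defaultdict(list)
    for path in glob.glob(pattern):
        # realpath follows symlinks in every component of the path
        by_real_path[os.path.realpath(path)].append(path)
    return {
        real: sorted(paths)
        for real, paths in by_real_path.items()
        if len(paths) > 1
    }
```

Run it on the forwarder with a glob equivalent to the monitor stanza, for example find_aliased_files("/global/file/system/apache/log/nodes*/access_log") (the leading slash is an assumption). Any entry in the result is a file that is reachable through more than one monitored path and is a candidate for a blacklist rule.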

0 Karma