What causes duplicate data in an indexer cluster?


We're seeing an issue on our indexer cluster where roughly 25% of events are duplicated. The raw logs contain no duplicates, and there are no duplicate or overlapping monitor stanzas. The bucket ID, index time, and Splunk server are identical across each set of duplicates.

Our indexers are clustered, and we're running Enterprise version 6.6.3 on Windows Server 2012 R2.

Here's our aggregated outputs.conf from a Universal Forwarder:

\splunk btool outputs list

maxEventSize = 1024
priority = <13>
type = udp
ackTimeoutOnShutdown = 30
autoLBFrequency = 30
autoLBVolume = 0
blockOnCloning = true
blockWarnThreshold = 100
compressed = false
connectionTimeout = 20
defaultGroup = gd_indexers
disabled = false
dropClonedEventsOnQueueFull = 5
dropEventsOnQueueFull = -1
ecdhCurves = prime256v1, secp384r1, secp521r1
forceTimebasedAutoLB = false
forwardedindex.0.whitelist = .*
forwardedindex.1.blacklist = _.*
forwardedindex.2.whitelist = (_audit|_introspection|_internal|_telemetry)
forwardedindex.filter.disable = false
heartbeatFrequency = 30
indexAndForward = false
maxConnectionsPerIndexer = 2
maxFailuresPerInterval = 2
maxQueueSize = auto
readTimeout = 300
secsInFailureInterval = 1
sendCookedData = true
sslQuietShutdown = false
sslVersions = tls1.2
tcpSendBufSz = 0
useACK = true
writeTimeout = 300
server = <List of Internal IPs>

If anyone can suggest an avenue for troubleshooting, it would be greatly appreciated. Please also let me know if I can provide more relevant information.



You said you looked at the index time. Did that include comparing the index time of both copies of the same event?
Pick a few duplicated events and look for any differences between the copies.
indextime, host, splunk_server... does anything differ?




Look for any difference between the original and its duplicate(s).
That is, for each event, how does it differ from its duplicate? And is there only one extra copy of each duplicated event, or more?



I used the following search:

index=<my_index> sourcetype=<my_sourcetype>
| eval bucket=_bkt
| eval indextime=_indextime
| table _time, indextime, bucket, splunk_server, _raw
| convert ctime(indextime)
| stats count list(*) as * by _raw
| where count>1
| fields * _raw

For each set of duplicates, the indextime field showed a single value repeated once per copy; the same was true of bucket and splunk_server.

There appears to be no difference between the copies. Most duplicated events have just two copies, though occasionally there are three to five on one indexer. It's not always the same indexer either; the duplicates seem relatively evenly distributed across the cluster.
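
One further check that might help narrow this down is sketched below. It assumes the internal field _cd holds each event's address within the index (bucket plus offset), so two copies with the same _cd would be the same stored event returned twice, while different _cd values would point to the event having been written twice. The output field names (cd, distinct_addresses, distinct_indextimes, distinct_sources, distinct_hosts) are illustrative only and are not from the original search.

```
index=<my_index> sourcetype=<my_sourcetype>
| eval cd=_cd
| eval indextime=_indextime
| stats count dc(cd) as distinct_addresses dc(indextime) as distinct_indextimes dc(source) as distinct_sources dc(host) as distinct_hosts values(splunk_server) as splunk_server by _raw
| where count>1
```

If distinct_addresses is 1 where count is greater than 1, the duplication is probably happening at search time rather than at index time. If distinct_addresses equals count, each copy was likely indexed separately, and the forwarding path (for example the useACK = true setting shown above) would be the next place to look. Treat this as a starting point for investigation, not a diagnosis.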



We're working on upgrading to 7.2.x as soon as we can get it scheduled. The linked question looks like it's talking about 6.4 as the solution; we're on 6.6. Appreciate you taking the time to post a suggestion though!
