"Got Done key for an unknown bucket without getting any data"

nickhills
Ultra Champion

I have a multisite indexer cluster, and a single peer in site 2 is repeatedly reporting:

ERROR TcpInputProc - event=replicationData status=failed err="Got Done key for an unknown bucket without getting any data"

150 or so times at once.

Restarting the peer (with cluster maintenance mode set) has had no effect.

The message gives no indication whether this is one bucket failing many times or many different buckets.
I'm at a bit of a loss as to how to diagnose this further. I wonder if I should stop the indexer, let the cluster fix itself up, and then bring the indexer back online, unless anyone can suggest a more scientific approach.


bohanlon_splunk
Splunk Employee

Reading this might help:
https://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/Anomalousbuckets

This query might also be useful:

| rest /services/cluster/master/info 
| fields buckets_to_fix.*.latest.reason 
| transpose 
| rename column AS bucket "row 1" AS reason 
| rex field=bucket "buckets_to_fix\.(?.*?)\.latest\.reason" 
| rex field=bucket "(?[^~]*?)~" 
| rex mode=sed field=reason "s/[0-9A-Z]{8}-[0-9A-Z]{4}-[0-9A-Z]{4}-[0-9A-Z]{4}-[0-9A-Z]{12}/{PEER}/" 
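
The named capture groups in the two rex commands above appear to have lost their `<name>` parts when the post was rendered, so the query is unlikely to run as shown. The following is one possible reconstruction, not the original. The group names `bucket_id` and `index` are assumptions, and the second rex is pointed at `bucket_id` rather than `bucket` so that it reads the extracted bucket ID, which is assumed to have the form index~localid~guid.

```
| rest /services/cluster/master/info
| fields buckets_to_fix.*.latest.reason
| transpose
| rename column AS bucket "row 1" AS reason
| rex field=bucket "buckets_to_fix\.(?<bucket_id>.*?)\.latest\.reason"
| rex field=bucket_id "(?<index>[^~]*?)~"
| rex mode=sed field=reason "s/[0-9A-Z]{8}-[0-9A-Z]{4}-[0-9A-Z]{4}-[0-9A-Z]{4}-[0-9A-Z]{12}/{PEER}/"
```

Since the query reads the master's REST endpoint, it is presumably meant to be run on the cluster master. The final rex replaces the first peer GUID in each reason with the literal text {PEER}, which should make identical reasons easier to compare across peers.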

DalJeanis
Legend

@bohanlon [Splunk] - I marked the query as code, but it's still missing the tags. Line 4 looks odd too; please repost.
