Getting Data In

Bucket replication queue full causing random indexer slowdown.

sbhale
Explorer

Had a weird issue where queues would fill up on random nodes, and the problem would rove around within the cluster.
Had a case opened with support and was working through it, making all sorts of adjustments and ruling out all sorts of issues, to no avail.
Finally had a breakthrough when I noticed that we were seeing
INFO BucketReplicator - replication queue for peer=<guid> bid=<bid> is full.
Followed almost immediately by
INFO BucketReplicator - replication queue for peer=<guid> bid=<bid> has room now.
over and over again. The gap between those two messages was only a few milliseconds.

No other obvious ERROR pointing to the cause.

1 Solution

sbhale
Explorer

Answering my own question so others will find it useful.

The presence of the above messages with the same peer guid was ruled to be the problem.
One of our peer nodes was acting up and slowing down any node replicating to it, just a little bit but enough that the slowdown propagated and caused queues to back up.
The solution was putting that node in manual detention, to be either rebuilt or retired.
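
For anyone who wants to check for the same pattern, here is a rough sketch of how the "is full" messages could be tallied per peer guid to see whether one peer stands out. The regex is an assumption based only on the message text quoted above (real splunkd.log lines also carry a timestamp prefix, which the search tolerates), and the function name and sample guids are made up for illustration.

```python
import re
from collections import Counter

# Assumed pattern, built from the message text quoted in the question.
# The guid is captured; the bucket id is matched but not kept.
FULL_RE = re.compile(
    r"BucketReplicator - replication queue for peer=(\S+) bid=\S+ is full"
)


def count_full_by_peer(lines):
    """Count 'replication queue ... is full' messages per target peer guid."""
    counts = Counter()
    for line in lines:
        match = FULL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines; the guids and bucket ids are placeholders.
sample = [
    "INFO BucketReplicator - replication queue for peer=AAAA bid=main~1~X is full.",
    "INFO BucketReplicator - replication queue for peer=AAAA bid=main~1~X has room now.",
    "INFO BucketReplicator - replication queue for peer=AAAA bid=main~2~X is full.",
    "INFO BucketReplicator - replication queue for peer=BBBB bid=main~3~Y is full.",
]

# Peers sorted by how often their queue was reported full, worst first.
for guid, hits in count_full_by_peer(sample).most_common():
    print(guid, hits)
```

In a real run you would feed it the lines of splunkd.log from each indexer; if the same guid tops the list on most nodes, that peer is the one to look at.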

