Had a weird issue where my queues would fill up on random nodes, and the backlog would rove around within the cluster.
Had a case open with support and was working through it, making all sorts of adjustments and ruling out all sorts of issues, to no avail.
Finally had a breakthrough when I noticed that we were seeing
INFO BucketReplicator - replication queue for peer=<guid> bid=<bid> is full
followed almost immediately by
INFO BucketReplicator - replication queue for peer=<guid> bid=<bid> has room now.
over and over again. The gap between those two messages was only a few milliseconds.
There was no other obvious ERROR pointing to the cause.
Answering my own question so others will find it useful.
The presence of the above messages, repeating with the same peer GUID, was determined to point to the problem.
One of our peer nodes was acting up and slowing down any node replicating to it, only a little bit, but enough that the slowdown propagated and caused queues to back up.
The solution was putting the node in manual detention so it could be either rebuilt or retired.
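If you want to check for the same pattern, here is a minimal sketch in Python that counts the "is full" messages per peer GUID, so one peer that keeps showing up stands out. It assumes the message text matches what I quoted above. The function name and the idea of passing the log file path on the command line are mine for illustration, not anything from Splunk. It ignores whatever timestamp prefix your log lines have.

```python
import re
import sys
from collections import Counter

# Matches the "is full" message text quoted in the post. Anything before
# "BucketReplicator" on the line (timestamp, log level) is ignored.
QUEUE_FULL = re.compile(
    r"BucketReplicator - replication queue for peer=(?P<peer>\S+) "
    r"bid=(?P<bid>\S+) is full"
)


def count_full_events_by_peer(lines):
    """Return a Counter mapping peer GUID -> number of 'is full' messages."""
    counts = Counter()
    for line in lines:
        match = QUEUE_FULL.search(line)
        if match:
            counts[match.group("peer")] += 1
    return counts


# Hypothetical usage: python count_full.py /path/to/splunkd.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log_file:
        for peer, hits in count_full_events_by_peer(log_file).most_common(5):
            print(peer, hits)
```

If one GUID dominates the counts across logs from several different nodes, that peer is the one to look at first. A high count on its own is not proof, it only tells you where to start.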