
Splunk Cluster Master Peer Handler Error

Path Finder

04-14-2014 13:03:09.199 -0700 ERROR ClusterMasterPeerHandler - Cannot add peer=x.x.x.x mgmtport=8089 (reason: non-zero pending job count=2090)

I'm seeing walls of this error on the cluster master. Does anyone know what's causing it and how serious a problem it is?


Path Finder

Initially the peer= value in the error cycled through all of my indexers, but as the pending job count wound down I was left with 2 peers still showing the error, each with a count of less than 10.
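If you want to watch that count wind down per peer rather than eyeballing the log, a rough Python sketch along these lines might help. It only assumes the line format quoted in the error above; the function names (`pending_counts`, `latest_counts`) and the sample lines other than the first are made up for illustration, and how you pull the lines out of the master's log is up to you.

```python
import re
from collections import defaultdict

# Matches the ClusterMasterPeerHandler error format quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"ERROR ClusterMasterPeerHandler - Cannot add peer=(?P<peer>\S+) "
    r"mgmtport=(?P<port>\d+) "
    r"\(reason: non-zero pending job count=(?P<count>\d+)\)"
)


def pending_counts(lines):
    """Return {peer: [(timestamp, count), ...]} in the order the lines appear."""
    history = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not this error
            history[match["peer"]].append((match["ts"], int(match["count"])))
    return dict(history)


def latest_counts(lines):
    """Return {peer: most recent pending job count seen for that peer}."""
    return {peer: entries[-1][1] for peer, entries in pending_counts(lines).items()}


if __name__ == "__main__":
    # First line is the one from the original post; the rest are synthetic
    # variations (hypothetical peers and counts) just to exercise the parser.
    sample = [
        "04-14-2014 13:03:09.199 -0700 ERROR ClusterMasterPeerHandler - "
        "Cannot add peer=x.x.x.x mgmtport=8089 (reason: non-zero pending job count=2090)",
        "04-14-2014 13:05:09.199 -0700 ERROR ClusterMasterPeerHandler - "
        "Cannot add peer=y.y.y.y mgmtport=8089 (reason: non-zero pending job count=1500)",
        "04-14-2014 13:10:09.199 -0700 ERROR ClusterMasterPeerHandler - "
        "Cannot add peer=x.x.x.x mgmtport=8089 (reason: non-zero pending job count=1200)",
    ]
    for peer, count in sorted(latest_counts(sample).items()):
        print(f"{peer}: latest pending job count = {count}")
```

A peer whose latest count keeps dropping is probably just draining its backlog; one that stays flat or climbs is the one worth looking at more closely.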

Those 2 peers showed the following in their logs repeatedly.

04-17-2014 11:02:03.853 -0700 WARN CMSlave - handleHeartbeatDone: successful heartbeat and re-add not received but proxy is in disconnected state. Forcing re-add.
04-17-2014 11:02:03.853 -0700 INFO CMSlave - event=addPeer resetting masks for all buckets on clearAndReadd

Eventually the error subsided. I attribute that to the cluster's replication activity settling into a steady state, since my search and replication factors are not met yet.