
Splunk Search Head Cluster 6.5.3 issue - cluster members stop running scheduled searches

ebaileytu
Communicator

We have a 9-node search head cluster on 6.5.3 where, at any given time, two members will stop running scheduled searches, which is causing scaling issues.

From the captain, I see the following message for the nodes that stop running scheduled searches:

05-27-2017 12:16:08.110 -0500 WARN SHCMaster - did not schedule removal for peer='56B4C21B-0B26-4BA2-826C-148E069F5FD0', err='SHPMaster::scheduleRemoveArtifactFromPeer_locked: aid=scheduler_adminbatch_RMD5ebd970f44716db9c_at_1495904700_9854_2E1C054F-9A8B-4D4A-BBC0-29F0562C7AED peer="xxxxx", guid="56B4C21B-0B26-4BA2-826C-148E069F5FD0" is pending some change, status='PendingDiscard''

I restart these nodes, but they stop running scheduled searches again a couple of hours later.

I cannot find anything in the docs or on Answers for this message. Do I just need to rsync the baseline?

Thanks!
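For anyone hitting this, the cluster's view of each member can be checked with the standard Splunk CLI (the $SPLUNK_HOME path and credentials below are placeholders for your install):

```shell
# Run from any cluster member: shows the current captain, each member's
# status, and artifact replication state. Placeholders: adjust path/auth.
$SPLUNK_HOME/bin/splunk show shcluster-status -auth admin:changeme
```

A member that the captain does not report as Up here would be consistent with it being skipped for scheduled-search delegation.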


skalliger
SplunkTrust

Did you try running a

resync shcluster-replicated-config

on the two affected SH cluster members? I found one known issue in 6.5.3, but I'm not sure that's your problem; you may want to contact Support if a resync doesn't fix it.
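For reference, a sketch of running the resync from the CLI on each affected member (the $SPLUNK_HOME path is a placeholder for your install):

```shell
# Discards the member's local copy of the replicated configuration
# baseline and pulls a fresh one from the captain.
# Placeholder: adjust $SPLUNK_HOME to your install path.
$SPLUNK_HOME/bin/splunk resync shcluster-replicated-config
```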

Skalli


ebaileytu
Communicator

I tried that, and I also added executor_workers = 20 to server.conf, but no change. I have an open case and am hoping for a response soon. Thanks!
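For context, the setting mentioned above lives in the [shclustering] stanza of server.conf on each member (a config sketch; the value 20 is taken from the post, not a recommendation):

```ini
# $SPLUNK_HOME/etc/system/local/server.conf
[shclustering]
# Number of worker threads the captain uses for scheduling/replication jobs
executor_workers = 20
```

A restart of the member is required for server.conf changes to take effect.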
