Search head cluster rolling restart issue

Hi,

I am having an issue with my SH cluster. It was working fine, but now there are no members. The captain is elected dynamically. All of the _flag options in the status output are 0. It seems as though none of the peers want to join. There are no errors in splunkd that point to a problem related to this. If it were an issue with a pass4SymmKey change, surely that would show up in the logs?

Any thoughts?

Re: Search head cluster rolling restart issue

Can you post the results of splunk show shcluster-status?

Re: Search head cluster rolling restart issue

Captain:
dynamic_captain : 1
elected_captain : captain
id : id
initialized_flag : 0
label : label
mgmt_uri : uri
min_peers_joined_flag : 0
rolling_restart_flag : 0
service_ready_flag : 0

Members:

The Members section doesn't list anything. When I restart the captain, a new captain is elected as normal, but no members are ever displayed and none join. Nothing shows up in the DMC either.
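For anyone comparing their own output against the one above, here is a minimal sketch of a shell helper that lists every *_flag field that is still 0. The function name unset_flags is hypothetical, and it assumes the status output uses the "name : value" layout shown in this post.

```shell
# Hypothetical helper: print the name of every *_flag field whose value is 0.
# It reads the text of `splunk show shcluster-status` on stdin, so it makes
# no assumptions about where Splunk is installed.
unset_flags() {
    awk -F':' '
        # Only consider lines whose field name ends in "_flag".
        $1 ~ /_flag[[:space:]]*$/ {
            name = $1
            value = $2
            gsub(/[[:space:]]/, "", name)   # strip padding around the name
            gsub(/[[:space:]]/, "", value)  # strip padding around the value
            if (value == "0") print name
        }'
}
```

It could be used as: $SPLUNK_HOME/bin/splunk show shcluster-status | unset_flags. With the output posted above it would print initialized_flag, min_peers_joined_flag, rolling_restart_flag and service_ready_flag, one per line.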

Re: Search head cluster rolling restart issue

Fixed using the following method: rebuilt the SHC using the KV store from one of the members.

Steps followed:

1. Stop all SHC members and back up the kvstore. Move the archive to a safe place.
   cd $SPLUNK_HOME/var/lib/splunk/kvstore
   tar cvfz kvstore-.tar.gz *

2. Remove the cluster configuration from all members. Since some members were not part of the cluster, and some that were part of the cluster no longer existed, we could not use the remove CLI. Instead, delete the [shclustering] stanza from server.conf.

3. Clean the raft and mongod folders.
   rm -rf $SPLUNK_HOME/var/run/splunk/raft/
   rm -rf $SPLUNK_HOME/var/lib/splunk/kvstore/mongo/*

4. Verify that all members have replication_factor = 3.
   $SPLUNK_HOME/bin/splunk btool server list shclustering | grep replication_factor

5. Start all members.
   $SPLUNK_HOME/bin/splunk start

6. Initialize all members. (Note: if the SHC is part of an indexer cluster, use the command for deploying an SHC with an indexer cluster.)
   splunk init shcluster-config -auth admin:changed -mgmt_uri https://sh1.example.com:8089 -replication_port 34567 -replication_factor 3 -conf_deploy_fetch_url https://:8089 -secret mykey -shcluster_label shcluster1

7. ONLY on the member that will be the first captain, restore the kvstore.
   $SPLUNK_HOME/bin/splunk stop
   cd $SPLUNK_HOME/var/lib/splunk/kvstore
   tar xvfz /kvstore-.tar.gz
   $SPLUNK_HOME/bin/splunk clean kvstore --cluster
   $SPLUNK_HOME/bin/splunk start

8. Bootstrap the first member.
   splunk bootstrap shcluster-captain -servers_list ":" -auth :

9. Verify that the kvstore is working and available.
   splunk show shcluster-status -auth :

10. Add the rest of the members to the cluster.
   splunk add shcluster-member -current_member_uri "https://hostname:mngmt_port"
   If it is a new member, change "current" to "new". For a new member the command must be run from the captain; for an existing member it must be run from the host itself.

Afterwards, a resync of the config bundle may be needed. Check the status in the DMC for this. If so, run:
   /opt/splunk/bin/splunk resync shcluster-replicated-config
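The replication factor check in the steps above relies on reading the grep output by eye on every member. As a rough sketch, the same check can be turned into a pass/fail test. The function name has_replication_factor is hypothetical, and it assumes btool prints the setting as "replication_factor = N".

```shell
# Hypothetical helper: succeed only if the btool output on stdin contains
# replication_factor and every occurrence equals the expected value ($1).
has_replication_factor() {
    expected="$1"
    awk -F'=' -v want="$expected" '
        $1 ~ /^[[:space:]]*replication_factor[[:space:]]*$/ {
            v = $2
            gsub(/[[:space:]]/, "", v)   # drop spaces around the value
            found = 1
            if (v != want) bad = 1
        }
        # Exit 0 when the setting was found and matched, 1 otherwise.
        END { exit ((found && !bad) ? 0 : 1) }'
}
```

Usage on each member might look like: $SPLUNK_HOME/bin/splunk btool server list shclustering | has_replication_factor 3 && echo OK. A missing setting counts as a failure, so a member whose [shclustering] stanza was removed and not yet re-initialized will not pass by accident.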
