I am trying to bootstrap a Search Head Cluster with three members, and it is failing.
I have gone through the setup of the search head cluster from scratch several times now, and each time I get the following error:
01-23-2015 00:44:46.543 +0000 ERROR SHPoolMasterPeerHandler - Cannot add peer=10.0.0.101 mgmtport=8089 (reason: removeOldPeer peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD, serverName=ip-10-0-0-101, hostport=10.0.0.101:8089, but found different peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD with serverName=ip-10-0-0-102 and hostport=10.0.0.102:8089 already registered and UP
The issue was caused by all the Search Head Cluster members having been installed from the same image. They all ended up with the same GUID, which caused the conflict.
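If you want to confirm this before rebuilding, a minimal sketch like the one below can compare the GUIDs across members. It assumes you have copied each member's instance.cfg to one machine and that the GUID is stored on a line of the form "guid = ...". The function name find_duplicate_guids is my own, not a Splunk command.

```shell
#!/bin/sh
# Hypothetical helper: print every GUID that appears in more than one
# of the given instance.cfg files (one GUID per output line).
# Usage: find_duplicate_guids sh1/instance.cfg sh2/instance.cfg sh3/instance.cfg
find_duplicate_guids() {
    # Keep only the value of each "guid = ..." line, strip stray
    # blanks and carriage returns, then report values seen more than once.
    sed -n 's/^[[:space:]]*guid[[:space:]]*=[[:space:]]*//p' "$@" \
        | tr -d ' \t\r' \
        | sort \
        | uniq -d
}
```

Empty output means every file carries a different GUID. Any line printed is a GUID shared by at least two members.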
To resolve this, we performed the steps below and rebuilt the environment.
1) Stop all SHC members.
2) Delete the GUID entry from $SPLUNK_HOME/etc/instance.cfg on all three search head members.
3) Clear the _raft folder on each member ($SPLUNK_HOME/var/run/splunk/_raft).
4) Remove the search head cluster stanza from $SPLUNK_HOME/etc/system/local/server.conf on each member.
The stanza looks like:
[shclustering]
conf_deploy_fetch_url = https://<DEPLOY_SERVER_URI>:<DEPLOY_SERVER_MANAGEMENT_PORT>
disabled = 0
mgmt_uri = https://<SHCM_URI>:<SHC_MANAGEMENT_PORT>
replication_factor = 3
id = 54F906ED-210B-4EB8-89E3-65266711271
5) Restart the Search Head Members.
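6) Initialize each Search Head Cluster member, then restart it:
splunk init shcluster-config -auth <username>:<password> -mgmt_uri <URI>:<management_port> -replication_port <replication_port> -replication_factor <n> -conf_deploy_fetch_url <URL>:<management_port> -secret <security_key>
splunk restart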
7) Bootstrap SH Cluster Members – and it worked.
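For anyone repeating this on several members, the file-level part of the steps above (2 to 4) can be sketched as a small shell function. This is only an illustration under assumptions: the member must already be stopped, "clear" is taken to mean emptying the _raft directory, and reset_shc_member is a name I made up, not a Splunk command. Back up the files before trying anything like this.

```shell
#!/bin/sh
# Hypothetical helper: reset the cluster-related state of ONE stopped member.
# Usage: reset_shc_member /opt/splunk     (the argument is that member's SPLUNK_HOME)
reset_shc_member() {
    shc_home=$1

    # Step 2: drop the "guid = ..." line from instance.cfg.
    shc_cfg="$shc_home/etc/instance.cfg"
    if [ -f "$shc_cfg" ]; then
        sed '/^[[:space:]]*guid[[:space:]]*=/d' "$shc_cfg" > "$shc_cfg.tmp" \
            && mv "$shc_cfg.tmp" "$shc_cfg"
    fi

    # Step 3: empty the _raft directory but keep the directory itself.
    shc_raft="$shc_home/var/run/splunk/_raft"
    if [ -d "$shc_raft" ]; then
        find "$shc_raft" -mindepth 1 -delete
    fi

    # Step 4: remove the [shclustering] stanza from the local server.conf.
    shc_conf="$shc_home/etc/system/local/server.conf"
    if [ -f "$shc_conf" ]; then
        awk '
            # Each stanza header decides whether the lines after it are skipped.
            /^\[/ { skip = ($0 ~ /^\[shclustering\]/) }
            !skip
        ' "$shc_conf" > "$shc_conf.tmp" && mv "$shc_conf.tmp" "$shc_conf"
    fi
}
```

Stopping the member before calling the function and restarting it afterwards (steps 1 and 5) are left to the operator, so the sketch never touches a running instance.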
I was trying to install a 3-member SHC on a single box (my laptop). I had not set a distinct serverName for each member in etc/system/local/server.conf. I ran "splunk clean all" on all members, set a distinct serverName in server.conf for each, restarted each node, and reran the bootstrap command on the node I wanted to be captain. Now when I run "splunk show shcluster-status" I can see all three members are up, and I no longer see the error in the log.
I have installed a Search Head Cluster with three members (shc62_01, shc62_02, and shc62_03) as follows:
1) First, install each member as a normal Splunk instance.
2) Install the Search Head Cluster deployer as well.
3) Initialize each intended member for the cluster. For shc62_01, shc62_02, and shc62_03 the commands look like this:
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_01_URI>:<shc62_01_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_01> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_02_URI>:<shc62_02_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_02> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_03_URI>:<shc62_03_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_03> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
4) Bootstrap the captain. This command can be executed on any Search Head Cluster member initialized in step 3:
splunk bootstrap shcluster-captain -servers_list "https://<shc62_01_URI>:<shc62_01_MgmtPort>,https://<shc62_02_URI>:<shc62_02_MgmtPort>,https://<shc62_03_URI>:<shc62_03_MgmtPort>" -auth admin:<admin_Password>
5) Check the status of the SHC members using the command:
./splunk show shcluster-status
This information is also provided at http://docs.splunk.com/Documentation/Splunk/6.2.1/DistSearch/SHCdeploymentoverview
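Since the -servers_list value is easy to mistype, here is a small sketch that builds it from a list of member URIs and prints the bootstrap command instead of running it. The helper names and the example.com hosts are made up for illustration. Joining with commas and no surrounding spaces is my assumption about the safest form, not something I have verified against every version.

```shell
#!/bin/sh
# Hypothetical helper: join member management URIs with commas (no spaces).
# Usage: build_servers_list https://sh1.example.com:8089 https://sh2.example.com:8089
build_servers_list() {
    shc_list=""
    for shc_uri in "$@"; do
        # Add a comma only between entries, never before the first one.
        shc_list="${shc_list:+$shc_list,}$shc_uri"
    done
    printf '%s\n' "$shc_list"
}

# Hypothetical helper: print (do not run) the bootstrap command for the
# given members, so it can be reviewed before pasting it on one member.
print_bootstrap_command() {
    printf 'splunk bootstrap shcluster-captain -servers_list "%s" -auth admin:<admin_password>\n' \
        "$(build_servers_list "$@")"
}
```

Calling print_bootstrap_command with the three member URIs gives a single line that can be checked and then run on whichever member should attempt to become captain.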
Hi rbal_splunk
Could you please help me run the bootstrap command to declare the captain for the SH cluster members? I was able to initialize the shcluster successfully as described in the docs, but running the bootstrap command gives the following error: "command error : bootstrap is not valid command". Am I missing something?
Please help!!!
Hi @Sourabhv05. I have @rbal_splunk next to me and she wanted to ask "Before bootstrapping, did you initiate each search head cluster member?" Also, @rbal_splunk's answer below was intended for you to see if that process will solve your issue.