How to troubleshoot Search Head Clustering initial bootstrap failing with error "found different peer with serverName and hostport already registered and UP"?

rbal_splunk
Splunk Employee

I am trying to bootstrap a search head cluster with three members, and the bootstrap fails. I have gone through the setup of the search head cluster from scratch several times now, and each time I get the following error:

01-23-2015 00:44:46.543 +0000 ERROR SHPoolMasterPeerHandler - Cannot add peer=10.0.0.101 mgmtport=8089 (reason: removeOldPeer peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD, serverName=ip-10-0-0-101, hostport=10.0.0.101:8089, but found different peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD with serverName=ip-10-0-0-102 and hostport=10.0.0.102:8089 already registered and UP
1 Solution

rbal_splunk
Splunk Employee

The issue occurred because all the search head cluster members were installed from the same image, so every member had the same GUID, and that caused the conflict.
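
One way to confirm this cause before rebuilding is to compare the guid value stored in each member's instance.cfg. The following is only a minimal shell sketch, not an official Splunk tool. The function names read_guid and find_duplicate_guids are made up for illustration, and it assumes each member's instance.cfg has been copied to (or is readable from) the machine where the script runs.

```shell
#!/bin/sh
# Hypothetical helpers: not part of the Splunk CLI.

# Print the guid value stored in one instance.cfg file.
read_guid() {
    # Matches a line such as: guid = D171EF5C-6D34-4320-B7D1-DA68F39561DD
    awk -F= '/^[ \t]*guid[ \t]*=/ { gsub(/[ \t\r]/, "", $2); print $2; exit }' "$1"
}

# Print every GUID that appears in more than one of the given files.
find_duplicate_guids() {
    for cfg in "$@"; do
        read_guid "$cfg"
    done | sort | uniq -d
}
```

For example, find_duplicate_guids member1/instance.cfg member2/instance.cfg member3/instance.cfg (hypothetical paths) prints nothing when every GUID is unique, and prints the shared GUID when members were cloned from one image.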

To resolve this, we performed the steps below and rebuilt the environment.

1) Stop all SHC members.
2) Delete the guid entry from $SPLUNK_HOME/etc/instance.cfg on all three search head cluster members.
3) Clear the _raft folder on each SHC member ($SPLUNK_HOME/var/run/splunk/_raft).
4) Remove the search head cluster stanza from $SPLUNK_HOME/etc/system/local/server.conf on each member.

The stanza looks like this:

[shclustering]
conf_deploy_fetch_url = https://<DEPLOY_SERVER_URI>:<DEPLOY_SERVER_MANAGEMENT_PORT>
disabled = 0
mgmt_uri = https://<SHCM_URI>:<SHC_MANAGEMENT_PORT>
replication_factor = 3
id = 54F906ED-210B-4EB8-89E3-65266711271

5) Restart the search head cluster members.
6) Initialize each search head cluster member, then restart it:

splunk init shcluster-config -auth <username>:<password> -mgmt_uri <URI>:<management_port> -replication_port <replication_port> -replication_factor <n> -conf_deploy_fetch_url <URL>:<management_port> -secret <security_key>
splunk restart

7) Bootstrap the search head cluster captain. After that, it worked.
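
Steps 2 to 4 above can be sketched in shell roughly as follows. This is an illustration under assumptions, not a supported procedure: remove_guid and remove_stanza are hypothetical helper names, the member is assumed to be stopped already, and each helper keeps a .bak copy of the file it edits.

```shell
#!/bin/sh
# Hypothetical helpers: not part of the Splunk CLI. Run only on a stopped member.

# Remove the guid line from an instance.cfg file, keeping a .bak copy.
remove_guid() {
    cp "$1" "$1.bak" &&
    sed '/^[[:space:]]*guid[[:space:]]*=/d' "$1.bak" > "$1"
}

# Remove one whole stanza (header plus its settings) from a .conf file,
# keeping a .bak copy. Example: remove_stanza server.conf shclustering
remove_stanza() {
    cp "$1" "$1.bak" &&
    awk -v header="[$2]" '
        /^\[/ { skip = ($0 == header) }   # each stanza header turns skipping on or off
        !skip                              # print lines outside the removed stanza
    ' "$1.bak" > "$1"
}

# Possible use on one stopped member (SPLUNK_HOME must point at that instance):
#   remove_guid   "$SPLUNK_HOME/etc/instance.cfg"
#   rm -rf        "$SPLUNK_HOME/var/run/splunk/_raft"
#   remove_stanza "$SPLUNK_HOME/etc/system/local/server.conf" shclustering
```

The usage lines are left as comments on purpose, since they delete cluster state. Check the .bak files before restarting the member.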

ghendrey_splunk
Splunk Employee

I was trying to install a three-member SHC on a single box (my laptop), and I had not set a distinct serverName for each member in etc/system/local/server.conf. I ran "splunk clean all" on all members, set a distinct serverName in server.conf for each, restarted each node, and reran the bootstrap command on the node I wanted to be captain. Now when I run "splunk show shcluster-status" I can see that all three members are up, and the error no longer appears in the log.
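
For a single-box setup like this, a quick check that the serverName values really differ can save a bootstrap attempt. The sketch below is illustrative only. The function names are hypothetical, and it assumes serverName is set under [general] in each instance's etc/system/local/server.conf.

```shell
#!/bin/sh
# Hypothetical helpers: not part of the Splunk CLI.

# Print the serverName set in the [general] stanza of one server.conf file.
read_server_name() {
    awk -F= '
        /^\[/ { in_general = ($0 == "[general]") }
        in_general && /^[ \t]*serverName[ \t]*=/ { gsub(/[ \t\r]/, "", $2); print $2; exit }
    ' "$1"
}

# Print every serverName that is used by more than one of the given files.
find_duplicate_server_names() {
    for conf in "$@"; do
        read_server_name "$conf"
    done | sort | uniq -d
}
```

Pass the server.conf path of each local instance. Any name that is printed is shared by at least two members and needs to be changed before bootstrapping.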

rbal_splunk
Splunk Employee

I installed a search head cluster with three members (shc62_01, shc62_02, and shc62_03) as follows:

1) Install each member as a normal Splunk instance.

2) Install the search head cluster deployer as well.

3) Initialize each intended member for the cluster, then restart it. For shc62_01, shc62_02, and shc62_03, the commands look like this:

splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_01_URI>:<shc62_01_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_01> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_02_URI>:<shc62_02_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_02> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_03_URI>:<shc62_03_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_03> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>

4) Bootstrap the captain. This command can be run on any search head cluster member initialized in step 3:

splunk bootstrap shcluster-captain -servers_list "https://<shc62_01_URI>:<shc62_01_MgmtPort>,https://<shc62_02_URI>:<shc62_02_MgmtPort>,https://<shc62_03_URI>:<shc62_03_MgmtPort>" -auth admin:<admin_Password>

5) Check the status of the SHC members:

splunk show shcluster-status

This information is also available in the documentation: http://docs.splunk.com/Documentation/Splunk/6.2.1/DistSearch/SHCdeploymentoverview
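
The -servers_list value in the bootstrap step is easy to mistype by hand. As a small sketch, the list can be assembled from the member URIs so that it is comma-separated with no stray spaces, which is how the documentation examples write it. The function name build_servers_list and the example.com hosts are placeholders of mine, not anything Splunk ships.

```shell
#!/bin/sh
# Hypothetical helper: not part of the Splunk CLI.

# Join member management URIs with commas and no surrounding spaces.
build_servers_list() {
    list=""
    for uri in "$@"; do
        list="${list:+$list,}$uri"
    done
    printf '%s\n' "$list"
}

# Possible use (placeholder hosts and password):
#   splunk bootstrap shcluster-captain \
#       -servers_list "$(build_servers_list https://sh1.example.com:8089 https://sh2.example.com:8089 https://sh3.example.com:8089)" \
#       -auth admin:<password>
```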

Sourabhv05
Communicator

Hi rbal_splunk

Could you please help me run the bootstrap command to declare the captain for the SH cluster members? I was able to initialize the shcluster successfully as described in the docs, but when I run the bootstrap command I get the following error: "command error : bootstrap is not valid command". Am I missing something?

Please help!!!

ppablo
Retired

Hi @Sourabhv05. I have @rbal_splunk next to me and she wanted to ask "Before bootstrapping, did you initiate each search head cluster member?" Also, @rbal_splunk's answer below was intended for you to see if that process will solve your issue.
