How to troubleshoot Search Head Clustering initial bootstrap failing with error "found different peer with serverName and hostport already registered and UP"?

rbal_splunk
Splunk Employee

I am trying to bootstrap a search head cluster with three members, and the bootstrap fails. I have gone through the search head cluster setup from scratch several times now, and each time the members report the following error:

01-23-2015 00:44:46.543 +0000 ERROR SHPoolMasterPeerHandler - Cannot add peer=10.0.0.101 mgmtport=8089 (reason: removeOldPeer peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD, serverName=ip-10-0-0-101, hostport=10.0.0.101:8089, but found different peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD with serverName=ip-10-0-0-102 and hostport=10.0.0.102:8089 already registered and UP
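The key detail in this message is that the same peer GUID appears for two different serverName/hostport pairs. As a rough way to spot that in a log line, here is a minimal shell sketch. The helper names peer_guids and same_guid_conflict are made up for illustration and are not Splunk commands; the sample line is the error quoted above.

```shell
# Sketch only: peer_guids and same_guid_conflict are hypothetical helper names,
# not Splunk commands.

# Print every GUID that follows "peer=" in the given log line, one per line.
# An IP address after "peer=" does not match the GUID pattern and is skipped.
peer_guids() {
    printf '%s\n' "$1" |
        grep -oE 'peer=[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}' |
        sed 's/^peer=//'
}

# Succeed when the line mentions at least two peer GUIDs and all are identical.
same_guid_conflict() {
    total=$(peer_guids "$1" | wc -l | tr -d ' ')
    unique=$(peer_guids "$1" | sort -u | wc -l | tr -d ' ')
    [ "$total" -gt 1 ] && [ "$unique" -eq 1 ]
}

line='01-23-2015 00:44:46.543 +0000 ERROR SHPoolMasterPeerHandler - Cannot add peer=10.0.0.101 mgmtport=8089 (reason: removeOldPeer peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD, serverName=ip-10-0-0-101, hostport=10.0.0.101:8089, but found different peer=D171EF5C-6D34-4320-B7D1-DA68F39561DD with serverName=ip-10-0-0-102 and hostport=10.0.0.102:8089 already registered and UP'

if same_guid_conflict "$line"; then
    echo "Same GUID reported for different members:"
    peer_guids "$line" | sort -u
fi
```

If the check succeeds, the members involved are presenting the same GUID to the captain, which points at cloned instances rather than a network or port problem.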
1 Solution

rbal_splunk
Splunk Employee

The issue occurred because all the search head cluster members were installed from the same image, so every member had the same GUID, which caused the conflict.
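To confirm this cause before rebuilding anything, one option is to compare the guid value in each member's instance.cfg. The sketch below assumes you have copied each member's $SPLUNK_HOME/etc/instance.cfg to one machine; read_guid and duplicate_guids are hypothetical helper names, and the file paths in the usage line are placeholders.

```shell
# Sketch only: read_guid and duplicate_guids are hypothetical helpers.

# Print the value of the "guid = ..." line from one instance.cfg file.
read_guid() {
    sed -n 's/^[[:space:]]*guid[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Print every GUID that appears in more than one of the given files.
# No output means all the files carry distinct GUIDs.
duplicate_guids() {
    for f in "$@"; do
        read_guid "$f"
    done | sort | uniq -d
}
```

Usage would look like: duplicate_guids sh1/instance.cfg sh2/instance.cfg sh3/instance.cfg. Any GUID printed is shared by at least two members.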

To resolve this, we performed the steps below and rebuilt the environment.

1) Stop all SHC members.
2) Delete the GUID entry from $SPLUNK_HOME/etc/instance.cfg on all three search head cluster members.
3) Clear the _raft folder on each SHC member ($SPLUNK_HOME/var/run/splunk/_raft).
4) Remove the search head clustering stanza from $SPLUNK_HOME/etc/system/local/server.conf on each member.

The stanza looks like this:

[shclustering]
conf_deploy_fetch_url = https://<DEPLOY_SERVER_URI>:<DEPLOY_SERVER_MANAGEMENT_PORT>
disabled = 0
mgmt_uri = https://<SHCM_URI>:<SHC_MANAGEMENT_PORT>
replication_factor = 3
id = 54F906ED-210B-4EB8-89E3-65266711271

5) Restart the search head cluster members.
6) Initialize each search head cluster member, then restart it:

splunk init shcluster-config -auth <username>:<password> -mgmt_uri <URI>:<management_port> -replication_port <replication_port> -replication_factor <n> -conf_deploy_fetch_url <URL>:<management_port> -secret <security_key>
splunk restart

7) Bootstrap the search head cluster members. This time it worked.
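Steps 2 to 4 above could be scripted roughly as follows. This is only a sketch under assumptions: strip_stanza and reset_member are hypothetical helper names, the script takes the Splunk home directory as an argument, it must only be run while Splunk is stopped on that member, and it edits files in place, so take backups first. It assumes the clustering settings live in etc/system/local/server.conf.

```shell
# Sketch only: strip_stanza and reset_member are hypothetical helpers.
# Run on a member only while Splunk is stopped, and back up the files first.

# Print a .conf file with one named stanza (header and settings) removed.
strip_stanza() {   # usage: strip_stanza <stanza_name> <file>
    awk -v name="[$1]" '
        /^\[/ { skip = ($0 == name) }   # each stanza header turns skipping on or off
        !skip { print }
    ' "$2"
}

reset_member() {   # usage: reset_member <splunk_home>
    home=$1
    cfg="$home/etc/instance.cfg"
    conf="$home/etc/system/local/server.conf"

    # Step 2: delete the guid line so the instance no longer carries the cloned GUID.
    sed '/^[[:space:]]*guid[[:space:]]*=/d' "$cfg" > "$cfg.new" &&
        mv "$cfg.new" "$cfg" || return 1

    # Step 3: clear the _raft folder.
    rm -rf "$home/var/run/splunk/_raft" || return 1

    # Step 4: remove the [shclustering] stanza from server.conf.
    strip_stanza shclustering "$conf" > "$conf.new" &&
        mv "$conf.new" "$conf"
}
```

On each stopped member this would be called as reset_member "$SPLUNK_HOME", followed by steps 5 to 7 by hand.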


ghendrey_splunk
Splunk Employee

I was trying to install a 3-member SHC on a single box (my laptop), and I had not set a distinct serverName for each member in etc/system/local/server.conf. I ran "splunk clean all" on all members, set a distinct serverName in server.conf for each, restarted each node, and reran the bootstrap command on the node I wanted to be captain. Now when I run "splunk show shcluster-status" I can see that all three members are up, and the error no longer appears in the log.
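For a setup like this, with several instances on one host, a quick check that every member has its own serverName might look like the sketch below. The helper names server_name and names_are_distinct are hypothetical, and the check assumes serverName is set in the [general] stanza of each member's etc/system/local/server.conf.

```shell
# Sketch only: server_name and names_are_distinct are hypothetical helpers.

# Print the serverName value from the [general] stanza of one server.conf file.
server_name() {
    awk '
        /^\[/ { in_general = ($0 == "[general]") }
        in_general && /^[[:space:]]*serverName[[:space:]]*=/ {
            sub(/^[^=]*=[[:space:]]*/, "")   # keep only the text after "="
            print
            exit
        }
    ' "$1"
}

# Succeed only when no serverName value is shared between the given files.
names_are_distinct() {
    dupes=$(for f in "$@"; do server_name "$f"; done | sort | uniq -d)
    [ -z "$dupes" ]
}
```

Pass one server.conf path per member, for example names_are_distinct sh1/etc/system/local/server.conf sh2/etc/system/local/server.conf sh3/etc/system/local/server.conf. A non-zero exit status means at least two members share a name.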


rbal_splunk
Splunk Employee

I have installed a search head cluster with 3 members (shc62_01, shc62_02 and shc62_03) using the following steps:

1) First, install each member as a normal Splunk instance.

2) Install the search head cluster deployer as well.

3) After that, initialize each intended search head cluster member. I enabled the instances shc62_01, shc62_02 and shc62_03 for the search head cluster using commands like these:

splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_01_URI>:<shc62_01_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_01> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_02_URI>:<shc62_02_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_02> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>
splunk init shcluster-config -auth admin:<password> -mgmt_uri https://<shc62_03_URI>:<shc62_03_MgmtPort> -replication_port <REPLICATION_PORT_FOR_shc62_03> -replication_factor 3 -conf_deploy_fetch_url https://<Deployer_URI>:<deployer_MgmtPort>

4) Bootstrap the captain. This command can be run on any search head cluster member initialized in step 3:

splunk bootstrap shcluster-captain -servers_list "https://<shc62_01_URI>:<shc62_01_MgmtPort>,https://<shc62_02_URI>:<shc62_02_MgmtPort>,https://<shc62_03_URI>:<shc62_03_MgmtPort>" -auth admin:<admin_Password>

5) Check the status of the SHC members using this command:

./splunk show shcluster-status

This information is also provided in the documentation: http://docs.splunk.com/Documentation/Splunk/6.2.1/DistSearch/SHCdeploymentoverview
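The -servers_list value in the bootstrap step is a single comma-separated string of member management URIs, and it is easy to leave stray spaces in it when typing by hand. Below is a small sketch for assembling it. The function name build_servers_list and the example.com hosts in the usage note are made up for illustration, and treating whitespace inside the list as unwanted is an assumption on my part.

```shell
# Sketch only: build_servers_list is a hypothetical helper.

# Join the given management URIs with commas, dropping any whitespace
# and skipping arguments that are empty after trimming.
build_servers_list() {
    list=""
    for uri in "$@"; do
        uri=$(printf '%s' "$uri" | tr -d '[:space:]')
        [ -n "$uri" ] || continue
        list="${list:+$list,}$uri"
    done
    printf '%s\n' "$list"
}
```

For example, build_servers_list "https://sh1.example.com:8089" "https://sh2.example.com:8089" "https://sh3.example.com:8089" prints one string that can be passed as the -servers_list argument of splunk bootstrap shcluster-captain.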


Sourabhv05
Communicator

Hi rbal_splunk

Could you please help me run the bootstrap command to elect the captain for the SH cluster members? I was able to initialize the SH cluster as described in the docs, but when I run the bootstrap command I get the following error: "command error: bootstrap is not valid command". Am I missing something?

Please help!!!


ppablo
Retired

Hi @Sourabhv05. I have @rbal_splunk next to me and she wanted to ask "Before bootstrapping, did you initiate each search head cluster member?" Also, @rbal_splunk's answer below was intended for you to see if that process will solve your issue.
