Search head had bad DNS entry - Now can't delete it from the cluster

New Member

One of the search heads I am building had an extra, incorrect A record in DNS. As a result, when I tried to elect a captain, the wrong name came back. The network team has since corrected DNS, but I still can't get the cluster master to see the search head under the correct name.

On the cluster master, the name shows up as myserver-D.mydomain.net.

When I try to elect a captain, I get the error below (the correct name should be https://myserver-A:8089):

04-05-2019 11:52:51.054 -0400 ERROR SHCRaftConsensus - failed appendEntriesRequest err: uri=https://myserver-C:8089/services/shcluster/member/consensus/pseudoid/raft_append_entries?output_mode..., error=400 - Mismatch in mgmturi and server URI provided to LEADER. Check URI strings in setconfiguration mgmturi = https://myserver-A:8089 remoteserver_name =

When I look at the cluster master, the server appears in the Search Head list under the incorrect name, myserver-D.mydomain.net. Can anyone tell me how to fix this, or where on the cluster master I can remove and correct the entry?

Motivator

To remove the node from the SHC, perform the steps below on that node (while Splunk is running):

  1. Remove the member:
    splunk remove shcluster-member

  2. Disable the member:
    splunk disable shcluster-config

  3. Clean the KVStore:
    splunk clean kvstore --cluster

If you want to re-add this member, first verify your DNS entry again (check for duplicate records, and check /etc/hosts on Linux).
Then follow these steps to add the member back into the cluster:

  1. Execute these commands in sequence on the problem node:
    splunk stop
    splunk clean all
    splunk start

  2. Re-initialize the node:
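    splunk init shcluster-config -auth <username>:<password> -mgmt_uri <URI>:<management_port> -replication_port <replication_port> -replication_factor <n> -conf_deploy_fetch_url <URL>:<management_port> -secret <security_key> -shcluster_label <label>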
    splunk restart
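
For illustration, here is roughly how step 2 might look with values filled in. Every value in this sketch (the credentials, the deployer host deployer.mydomain.net, replication port 9200, replication factor 3, the secret, and the label shcluster1) is a hypothetical placeholder; only https://myserver-A:8089 comes from the question. The flag spellings follow the Splunk documentation for adding a cluster member, so check them against your version. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: all values below are placeholders, not a real deployment.
MGMT_URI="https://myserver-A:8089"                  # must match the name DNS returns for this member
DEPLOYER_URI="https://deployer.mydomain.net:8089"   # hypothetical deployer address
REPL_PORT=9200                                      # hypothetical replication port
REPL_FACTOR=3                                       # hypothetical replication factor
LABEL="shcluster1"                                  # hypothetical cluster label

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Re-initialize this node as a search head cluster member.
init_member() {
    run splunk init shcluster-config \
        -auth "admin:changeme" \
        -mgmt_uri "$MGMT_URI" \
        -replication_port "$REPL_PORT" \
        -replication_factor "$REPL_FACTOR" \
        -conf_deploy_fetch_url "$DEPLOYER_URI" \
        -secret "mysecretkey" \
        -shcluster_label "$LABEL"
}

init_member
```

Printing the command first makes it easy to confirm that the management URI carries the corrected host name before anything is written to the node.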

Additional documentation can be found here: https://docs.splunk.com/Documentation/Splunk/7.2.5/DistSearch/Addaclustermember#Add_a_member_that_wa...
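
The DNS check recommended before re-adding the member could be sketched as follows. This assumes a Linux host where getent is available; the helper names distinct_addrs and hosts_entries_for, the host name, and the IP address in the usage comments are illustrative and not part of Splunk.

```shell
#!/bin/sh
# Count distinct addresses in resolver output read from stdin.
# The first whitespace-separated field of each line is taken as the address.
distinct_addrs() {
    awk 'NF { print $1 }' | sort -u | wc -l | tr -d ' '
}

# Print /etc/hosts-style lines from stdin that list the given name,
# ignoring anything after a '#' comment marker.
hosts_entries_for() {
    sed 's/#.*//' | awk -v n="$1" '{
        for (i = 2; i <= NF; i++) if ($i == n) { print; break }
    }'
}

# Example usage on the member (name and address are placeholders):
#   getent ahostsv4 myserver-A | distinct_addrs    # a count above 1 is worth a closer look
#   hosts_entries_for myserver-A < /etc/hosts      # local overrides for the name
#   getent hosts 10.0.0.5                          # which name the address maps back to
```

A count above one is not necessarily wrong, but in a case like this one it is a quick hint that a stray record is still being served or cached.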

New Member

I went through this procedure on all three of my search heads.

I also found another post that adds a couple of extra steps to this one:

https://answers.splunk.com/answers/210634/how-to-troubleshoot-search-head-clustering-initial.html

I was able to get all the search heads to show up a second time, each with a different GUID. The incorrect name kept showing up even though the bad entry had been removed from DNS and I had flushed the DNS cache on all my servers just to be sure. I believe there must be a setting in a config file somewhere on the master node that was not being overwritten.

I blew away my cluster master and started over from scratch and everything worked as it should.
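
For anyone hitting the same symptom: one setting worth ruling out, offered as an assumption and not as the confirmed cause in this thread, is the mgmt_uri value that splunk init shcluster-config records in the [shclustering] stanza of server.conf on each member. A minimal sketch for reading it (the helper name shc_mgmt_uri is illustrative, and the path assumes a default layout under $SPLUNK_HOME):

```shell
#!/bin/sh
# Print the mgmt_uri value from the [shclustering] stanza of a
# server.conf-style file read from stdin.
shc_mgmt_uri() {
    awk '
        # Track whether the current stanza is [shclustering].
        /^\[/ { in_stanza = ($0 == "[shclustering]"); next }
        # Inside that stanza, strip "mgmt_uri =" and print the value.
        in_stanza && /^[[:space:]]*mgmt_uri[[:space:]]*=/ {
            sub(/^[^=]*=[[:space:]]*/, "")
            print
        }
    '
}

# Example usage:
#   shc_mgmt_uri < "$SPLUNK_HOME/etc/system/local/server.conf"
# btool can show the merged value and the file it comes from:
#   splunk btool server list shclustering --debug
```

If the value printed still carries the old host name, the node was initialized before DNS was corrected and would need to be re-initialized with the right URI.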
