I am automating the removal of a search head (SH) instance from both the search head cluster and the indexer cluster. Removing the SH from the shcluster is easy, but removing the SH from the indexer cluster seems problematic -- assuming it is done via a CLI command and not by restarting the cluster master instance.
The SH was added to the index cluster using the following command from the SH:
./splunk edit cluster-config -auth admin:${PASSWD} -mode searchhead -site ${SITE} -master_uri ${CLUSTERMSTR_URI} -replication_port 8080 -secret ${CLUSTERMSTR_PASSWD}
Also from the SH, I am trying to remove the SH from the indexer cluster using:
./splunk remove cluster-master -master_uri https://splunk-clustermstr-dev.fmrco.com:8089 -secret ${CLUSTER_PWD}
result: In handler 'clustersearchheadconfig': Cannot remove this master.
I also tried to run the following command from the cluster master:
./splunk remove cluster-peers -peers 8C842081-A220-4081-87AC-92915D567845
Result:
In handler 'clustermastercontrol': Remove aborted, Reason: Cannot find peer with guid=8C842081-A220-4081-87AC-92915D567845 to remove.
I wasn't sure why until I ran ./splunk list cluster-peers and saw that it only shows indexer entries, not SH entries.
I am hoping not to have to restart the cluster master instance to get rid of removed SH instances -- please let me know how.
Regards.
I was confused at first, tbh, and had to read your post a few times.
I hope I understood what you want to do.
The SH was added to the index cluster using the following command from the SH: [...]
I don't like the wording of a SH "joining" an indexer cluster, because the SH does not actually join it. It just gets the information about where to find the peers from the cluster master. The SH (standalone or cluster, doesn't matter) does not itself know where the information on peers is stored.
Btw, are you using a multisite cluster? Otherwise the "-site" parameter is not needed. Just asking out of curiosity.
Also from the SH, am trying to remove the SH from the index cluster using:
./splunk remove cluster-master -master_uri https://splunk-clustermstr-dev.fmrco.com:8089 -secret ${CLUSTER_PWD}
You have a few possibilities here:
a) You could use splunk remove shcluster-member on the search head you want to be removed. I don't know how that would affect a possible search head captain though (if you have not set a static captain).
b) My preferred version: identify the captain, go there, and run the following command: splunk remove shcluster-member -mgmt_uri URI:management_port
where URI is the SH you want to remove.
c) Here it depends on whether you want to use this instance in the future or not: either use splunk stop, or run (1) splunk remove shcluster-member and then (2) splunk disable shcluster-config.
For more information, see: http://docs.splunk.com/Documentation/Splunk/6.5.1/DistSearch/Removeaclustermember
After stopping it, it should be removed from the KV store and the SH will no longer have access to the indexers. In my previous tests, the KV store was always updated without any restarts. You should still check it (see above link).
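Since the goal is automation, options (b) and (c) could be wrapped in a small script. What follows is only a sketch under assumptions: the install path, the example URI, and the function and variable names are hypothetical, and it defaults to a dry run that prints the commands rather than executing them. It does not address the entry shown on the cluster master.

```shell
#!/bin/sh
# Sketch only: wraps the two CLI commands mentioned above.
# SPLUNK_BIN, DRY_RUN and the function names are hypothetical helpers.

SPLUNK_BIN="${SPLUNK_BIN:-/opt/splunk/bin/splunk}"   # assumed install path
DRY_RUN="${DRY_RUN:-1}"                              # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option (b): run this on the captain, passing the management URI
# of the search head to remove (for example https://sh2.example.com:8089).
remove_member_from_captain() {
    member_uri="$1"
    run "$SPLUNK_BIN" remove shcluster-member -mgmt_uri "$member_uri"
}

# Option (c), step 2: run this on the removed member itself
# if the instance will be reused outside the cluster.
disable_member_config() {
    run "$SPLUNK_BIN" disable shcluster-config
}
```

To try it, source the file and call remove_member_from_captain with the member's URI; set DRY_RUN=0 only once the printed commands look right, since the removal is not something you want to trigger by accident.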
I also tried to run the following command from the cluster master:
./splunk remove cluster-peers -peers 8C842081-A220-4081-87AC-92915D567845
Splunk's "cluster-peers" commands always refer to indexers, never to search heads.
Even though an indexer peer or SH cluster-member will still appear in the DMC after removing it, you could try and see whether the Search Head gets removed after a few minutes. If not, a restart will be necessary.
Btw, what is so bad about restarting your master after removing a server from the cluster? It's fast and doesn't interrupt indexing, so I don't see a problem here. But I'm happy to hear about a reason.
Did I answer your question or did I miss your point?
Edit: Forgot something in c).
Thank you for the comments, skalliger.
To your point, I am currently running the following to remove the SH from the shcluster:
./splunk/bin/splunk remove shcluster-member -auth admin:${PASSWD}
It works well.
What I am trying to do is stop it from showing up as 'down' in the cluster master's search heads tab. I am willing to restart the cluster master if I have to -- it does work -- it is just really messy. I should be able to run a CLI command to remove it from the search heads tab.
thanks again.
Also, we're pretty sure that after some time the entry will simply disappear from the cluster master. So you could remove it by shutting it down and not bother with clearing it from the master.
I had a similar situation recently where this happened. Support suggested the answer above, but it was unfortunately too late, as I had already removed core Splunk. Some time passed, and I reached a point where I needed to bounce the splunkd service. After the service came back online, I noticed that the SH had been removed from the Search Heads tab of the Indexer Clustering page.