Deployment Architecture

Remove peer from cluster then re-add

jcarpio9
Engager

Hello all:

Need to remove a peer from cluster due to sync errors. Then re-add to sync up correctly (per support). It's the only peer to the master with a replication factor of 2.

Is splunk offline or splunk offline --enforce-counts the appropriate way to remove the peer?

The following article seems to suggest that the peer won't shutdown (but will it be removed?) due to the number of peers < the replication factor.

http://docs.splunk.com/Documentation/Splunk/5.0.5/Indexer/Takeapeeroffline#Take_a_peer_down_permanen...

Thanks.

Tags (1)
0 Karma
1 Solution

svasan_splunk
Splunk Employee
Splunk Employee

Since you plan to bring back the peer don't use offline --enforce-counts. That is to completely remove the peer from the cluster and the master then makes extra copies to meet RF/SF. That's not what you want for this. (The peer doesn't shutdown until RF and SF are met. So if there aren't enough other peers to meet RF/SF then this peer won't shutdown)

If your cluster is currently searchable and you want to try to preserve that, use offline. This might take a bit of time ( a few minutes). It waits for buckets to become primary somewhere else and also for any ongoing searches to finish (with a timeout: it still forces shutdown after some minutes IIRC). If you planning to do some maintenance work before bringing back the peer, make sure to set restart_timeout appropriately on the master. It defaults to 10 mts I think.

If your cluster is already in a weird state and you are just going to stop the peer and bring it back up, the simplest may be to just do splunk stop on the peer. Though, the master will initiate fixup to meet RF/SF after a 60s timeout.

View solution in original post

rbal_splunk
Splunk Employee
Splunk Employee

Starting Splunk Version 6.2 and above we have CLI command to remove Cluster Peer/Clustered Indexer .
Refer :::

http://docs.splunk.com/Documentation/Splunk/6.2.1/Indexer/Removepeerfrommasterlist

0 Karma

svasan_splunk
Splunk Employee
Splunk Employee

Since you plan to bring back the peer don't use offline --enforce-counts. That is to completely remove the peer from the cluster and the master then makes extra copies to meet RF/SF. That's not what you want for this. (The peer doesn't shutdown until RF and SF are met. So if there aren't enough other peers to meet RF/SF then this peer won't shutdown)

If your cluster is currently searchable and you want to try to preserve that, use offline. This might take a bit of time ( a few minutes). It waits for buckets to become primary somewhere else and also for any ongoing searches to finish (with a timeout: it still forces shutdown after some minutes IIRC). If you planning to do some maintenance work before bringing back the peer, make sure to set restart_timeout appropriately on the master. It defaults to 10 mts I think.

If your cluster is already in a weird state and you are just going to stop the peer and bring it back up, the simplest may be to just do splunk stop on the peer. Though, the master will initiate fixup to meet RF/SF after a 60s timeout.

Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...