Deployment Architecture

Remove peer from cluster then re-add

jcarpio9
Engager

Hello all:

I need to remove a peer from the cluster because of sync errors, then re-add it so it syncs up correctly (per Support). It's the only peer attached to the master, and the replication factor is 2.

Is splunk offline or splunk offline --enforce-counts the appropriate way to remove the peer?

The following article seems to suggest that the peer won't shut down (but will it be removed?) because the number of peers is less than the replication factor.

http://docs.splunk.com/Documentation/Splunk/5.0.5/Indexer/Takeapeeroffline#Take_a_peer_down_permanen...

Thanks.


svasan_splunk
Splunk Employee

Since you plan to bring the peer back, don't use offline --enforce-counts. That option is for removing a peer from the cluster permanently: the master makes extra copies to meet RF/SF, and the peer doesn't shut down until RF and SF are met. So if there aren't enough other peers to meet RF/SF, this peer won't shut down. That's not what you want here.

If your cluster is currently searchable and you want to try to preserve that, use offline. This might take a few minutes: it waits for buckets to become primary somewhere else and for any ongoing searches to finish (with a timeout; it still forces shutdown after some minutes, IIRC). If you are planning to do some maintenance work before bringing the peer back, make sure to set restart_timeout appropriately on the master. It defaults to 10 minutes, I think.
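For illustration, the graceful path might look roughly like the sketch below. The run helper, the function names, the /opt/splunk default and the 900-second value are all my own placeholders, not anything from the docs; only splunk offline, splunk start and the restart_timeout setting come from the discussion above. It prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch only: SPLUNK_HOME default, run(), and the function names are illustrative.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# Dry-run by default: print the command instead of executing it.
# Set DRY_RUN=0 to run for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the master, before a longer maintenance window, raise restart_timeout
# in server.conf (900 seconds is an arbitrary example value):
#   [clustering]
#   restart_timeout = 900

# On the peer: graceful offline, maintenance, then start again.
peer_offline() { run "$SPLUNK_HOME/bin/splunk" offline; }
peer_start()   { run "$SPLUNK_HOME/bin/splunk" start; }

peer_offline
# ... maintenance work goes here ...
peer_start
```

If the peer comes back within restart_timeout, the master should not need to start fixing up buckets in the meantime.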

If your cluster is already in a weird state and you are just going to stop the peer and bring it back up, the simplest approach may be to run splunk stop on the peer. Note, though, that the master will initiate fixup to meet RF/SF after a 60s timeout.
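A minimal sketch of that quick stop-and-start, assuming Splunk lives in /opt/splunk on the peer (the run helper and peer_bounce name are just placeholders of mine). It is a dry run unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: SPLUNK_HOME default, run(), and peer_bounce are illustrative.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# Dry-run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Plain stop followed by start on the peer; start only runs if stop succeeded.
peer_bounce() {
    run "$SPLUNK_HOME/bin/splunk" stop &&
    run "$SPLUNK_HOME/bin/splunk" start
}

peer_bounce
```

Since the master starts fixup after about 60s, this only avoids extra bucket copying if the peer is back very quickly.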


rbal_splunk
Splunk Employee

Starting with Splunk version 6.2, there is a CLI command to remove a cluster peer (clustered indexer) from the master's list.
See:

http://docs.splunk.com/Documentation/Splunk/6.2.1/Indexer/Removepeerfrommasterlist
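As I recall it from that page, the command is run on the master and takes the peer's GUID; please check the exact syntax against the docs for your version. In this sketch the GUID placeholder, the run helper and the /opt/splunk default are assumptions of mine, and it only prints the command unless DRY_RUN=0 is set. As I understand it, the command applies to peers that are already down.

```shell
#!/bin/sh
# Sketch only: SPLUNK_HOME default, run(), remove_peer, and the GUID
# placeholder are illustrative. Run this on the master, not on the peer.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
PEER_GUID="${PEER_GUID:-<peer-guid>}"

# Dry-run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove one peer (identified by GUID) from the master's peer list.
remove_peer() {
    run "$SPLUNK_HOME/bin/splunk" remove cluster-peers -peers "$1"
}

remove_peer "$PEER_GUID"
```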
