Would it be possible to bring the new servers online in their respective pools and have them sync, so that we can then remove the old servers one at a time? If so, what is the best order in which to do this?
What would be the best way to go about this transition?
When we add new indexers, we add them in, rebalance, then take the old ones out and rebalance again. It hasn't failed us yet. Do you need details on doing something like this?
Details would be wonderful, thank you.
Rebalancing
From the docs, on the cluster master:
splunk rebalance cluster-data -action start [-index index_name] [-max_runtime interval_in_minutes]
I usually just run:
splunk rebalance cluster-data -action start
because I want all the indexes rebalanced and the default runtime is usually enough. This may take a while, but in most cases where we are doing this, it takes less than 30 minutes.
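If you would rather wrap this than type it each time, here is a minimal sketch, assuming you run it on the cluster master. The function names and the SPLUNK_BIN variable are my own placeholders, not anything Splunk ships; the -action status call is the documented way to check on a rebalance that is already running.

```shell
#!/bin/sh
# Sketch only: thin wrappers around the rebalance commands, run on the cluster master.
# SPLUNK_BIN and both function names are assumptions made for this example.
SPLUNK_BIN="${SPLUNK_BIN:-splunk}"

rebalance_start() {
    # With no -index, every index is rebalanced.
    # An optional first argument caps the run at that many minutes.
    if [ -n "$1" ]; then
        "$SPLUNK_BIN" rebalance cluster-data -action start -max_runtime "$1"
    else
        "$SPLUNK_BIN" rebalance cluster-data -action start
    fi
}

rebalance_status() {
    # Report on a rebalance that is in progress.
    "$SPLUNK_BIN" rebalance cluster-data -action status
}
```

For example, rebalance_start 60 would start a rebalance of all indexes limited to 60 minutes, and rebalance_status can be run afterwards to see how far along it is.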
Removing the indexer (search peer)
Caution: The following comes from my administration notes. I'm pretty sure the steps are correct, but I can't take responsibility for any problems that may arise. They have not caused any bad effects in my environment, so I think you should be safe.
Take the indexer offline by running the following on the indexer to be removed:
splunk offline --enforce-counts
Then you will have to wait for the status on the CM (cluster master) to show that all is well through the Monitoring Console. All data should be searchable, the search factor met and the replication factor met across all indexers.
To see the status of the cluster and get the GUID for the next step, on the CM run:
splunk show cluster-status
Run this command on the CM for the indexer you are decommissioning, once it is finally offline. This will keep it from joining the cluster if it comes back online.
splunk remove cluster-peers -peers <guid>
Only remove one search peer at a time, then make sure the cluster is all okay (all the indexes are fully replicated and searchable) before you remove a second one.
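To make the one-at-a-time rule harder to break by accident, the CM-side removal could be wrapped roughly like this. This is a sketch under my own assumptions: remove_one_peer and SPLUNK_BIN are hypothetical names, and it only covers the part that runs on the CM, so the splunk offline step on the indexer still has to be done first.

```shell
#!/bin/sh
# Sketch only: CM-side removal of a single decommissioned peer.
# SPLUNK_BIN and remove_one_peer are assumptions made for this example.
SPLUNK_BIN="${SPLUNK_BIN:-splunk}"

remove_one_peer() {
    # Accept exactly one GUID so that peers are removed one at a time.
    if [ "$#" -ne 1 ] || [ -z "$1" ]; then
        echo "usage: remove_one_peer <guid>" >&2
        return 1
    fi
    case "$1" in
        *,*) echo "pass a single GUID, not a list" >&2; return 1 ;;
    esac
    # Remove the offline peer from the cluster master's list.
    "$SPLUNK_BIN" remove cluster-peers -peers "$1" || return 1
    # Print the cluster status so it can be reviewed before the next removal.
    "$SPLUNK_BIN" show cluster-status
}
```

It does not decide for you whether the cluster is healthy. It only prints the status, so you still need to confirm that everything is replicated and searchable before calling it again with the next GUID.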
So the steps are: