I am performing a migration of a multi-site indexer cluster with 2 sites: RF=2, SF=2, with 1 copy of raw data and 1 copy of tsidx data in each site. There are 40 indexers in total, 20 per site.
My approach is as follows:
I am currently at step 5. The problem is that offlining each indexer takes a couple of hours. I am aware that a lot of factors play a significant role here, not least the hardware and the amount of data (~900T in total). Nevertheless, I would like to know whether there are improvements that can still be made through Splunk configuration changes.
Appreciate your thoughts,
Splunk 7.0 has some new features that might help in this area; however, beyond having faster I/O and/or faster servers, I am unsure whether there are any other tweaks you can make to improve this...
You can adjust the number of buckets worked on concurrently by each peer via settings on the Master Node. In server.conf on the Master Node you can add the following:
[clustering]
max_peer_build_load = <integer>
* This is the maximum number of concurrent tasks to make buckets searchable that can be assigned to a peer.
* Defaults to 2.
max_peer_rep_load = <integer>
* This is the maximum number of concurrent non-streaming replications that a peer can take part in as a target.
* Defaults to 5.
max_peer_sum_rep_load = <integer>
* This is the maximum number of concurrent summary replications that a peer can take part in as either a target or source.
* Defaults to 5.
Provided you have the hardware to handle the additional CPU, memory, and disk load, these values can be safely increased. Not knowing your environment, I'd recommend some caution: increase the settings in small increments while monitoring load on your indexers (see the sketch below).
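As an illustrative sketch only (the exact values are assumptions and should be tuned to your own hardware), a modest first increase in server.conf on the Master Node might look like this:

# server.conf on the Master Node -- illustrative values, not a recommendation
[clustering]
# let each peer run 4 concurrent make-searchable (bucket fixup) tasks instead of the default 2
max_peer_build_load = 4
# let each peer be the target of 10 concurrent non-streaming replications instead of the default 5
max_peer_rep_load = 10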
Additionally, these settings can be modified at run time (in memory only, not saved to config) with the following commands, with no restart required. Perform these on the Master Node:
splunk edit cluster-config -max_peer_build_load 4
splunk edit cluster-config -max_peer_rep_load 10
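If you want to confirm the values the Master is currently using, one option (assuming you have admin credentials and access to the management port; hostname and password below are placeholders) is to query the cluster config REST endpoint, for example:

# query the Master Node's active cluster configuration
curl -k -u admin:changeme https://<master_node>:8089/services/cluster/config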