I have a Splunk deployment with two Splunk Receivers and a cluster master operating together as an indexer cluster. I am looking to do some Windows OS updates on all the machines. In preparation for this I have been looking online at some of the Splunk docs. I have read the docs about putting a cluster in maintenance mode (see Ref 1) as well as the docs about taking a peer offline (see Ref 2).
Is it best to first put the cluster into Maintenance Mode, then put one of the Indexers into "offline" mode, and then perform the necessary updates on the offline indexer? Would I be correct in assuming that whilst in Maintenance Mode, Indexers that are still online will still be able to successfully ingest data (it just won't roll hot buckets during this time)? Since I'm not sure how long the updates will take, is there a maximum restart_timeout value that can be assigned to the cluster when taking a peer offline? When performing updates on a cluster master, is it simply necessary to put the Cluster Master in Maintenance Mode?
Before taking a peer offline, put the cluster into maintenance mode first. https://docs.splunk.com/Documentation/Splunk/7.2.4/Indexer/Usemaintenancemode
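Data ingestion continues while the cluster is in maintenance mode. What is suspended is most bucket fix-up activity (the re-replication that restores the replication and search factors), which Splunk takes care of automatically once maintenance mode is turned off and the cluster is stable.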
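To make the order of operations concrete, here is a minimal sketch that only prints the sequence; it does not run anything. The function name `peer_update_plan` and the host label `indexer1` are made up for illustration, and it assumes the `splunk` binary is on the PATH of each machine. The CLI commands themselves (`enable maintenance-mode`, `show maintenance-mode`, `offline`, `start`, `disable maintenance-mode`) are the standard ones, but check them against the docs for your version.

```python
from typing import List, Tuple

SPLUNK = "splunk"  # assumption: the splunk binary is on PATH


def peer_update_plan(peer: str) -> List[Tuple[str, str]]:
    """Return (where to run, command) pairs, in order, for patching one peer.

    `peer` is just a label for the indexer being updated (hypothetical name).
    """
    return [
        # Maintenance mode is switched on and off on the cluster master.
        ("cluster master", f"{SPLUNK} enable maintenance-mode"),
        ("cluster master", f"{SPLUNK} show maintenance-mode"),
        # The peer is taken offline from the peer itself.
        (peer, f"{SPLUNK} offline"),
        (peer, "<apply Windows updates and reboot>"),
        # Harmless if the Splunk service already started after the reboot.
        (peer, f"{SPLUNK} start"),
        ("cluster master", f"{SPLUNK} disable maintenance-mode"),
    ]


if __name__ == "__main__":
    for where, command in peer_update_plan("indexer1"):
        print(f"[{where}] {command}")
```

The point of keeping it as a plan rather than a script is that the commands run on different machines, so each step still has to be carried out by hand (or by whatever remote tooling you already use).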
I am not sure about the maximum restart_timeout value.
Situations such as persistent network problems or repeated offlining of peers can generate an unacceptable number of small buckets, which is why switching the cluster to maintenance mode is recommended.
Note: By default, the cluster only expects a peer to be offline for 60 seconds (the restart_timeout setting). I believe you can change this setting in a conf file. When taking a single peer down, make sure the replication factor and search factor are still met.
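As a rough sketch of where that setting lives: as far as I know, restart_timeout belongs in the [clustering] stanza of server.conf on the cluster master, and it can also be set from the CLI with `splunk edit cluster-config -restart_timeout <seconds>`. The helper names below are hypothetical and the 900 second value is only an example, not a recommendation; verify both forms against the docs for your version before relying on them.

```python
def restart_timeout_command(seconds: int) -> str:
    """Build the CLI command (run on the cluster master) that sets restart_timeout."""
    if seconds <= 0:
        raise ValueError("restart_timeout must be a positive number of seconds")
    return f"splunk edit cluster-config -restart_timeout {seconds}"


def restart_timeout_stanza(seconds: int) -> str:
    """Build the equivalent server.conf fragment for the cluster master."""
    if seconds <= 0:
        raise ValueError("restart_timeout must be a positive number of seconds")
    return f"[clustering]\nrestart_timeout = {seconds}\n"


if __name__ == "__main__":
    # Example value only: 15 minutes instead of the 60 second default.
    print(restart_timeout_command(900))
    print(restart_timeout_stanza(900))
```

Either form changes the same setting; the CLI route avoids editing the file by hand.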