Is anybody else having issues with bucket replication after applying a bundle on a cluster?
It seems that some indexes do not recover properly from the restart. Some of them end up with only 1 searchable copy, even though the cluster is set to search factor (SF) 2 and replication factor (RF) 2. Sometimes I also get a message or two about a bucket in the pending discard state.
Is there something else that needs to be done before applying the bundle to prevent this from happening?
In my limited experience so far with a 20-node Windows cluster, it seems hit-or-miss.
I tend to apply bundles only during off hours because the rolling restart takes me about 40 minutes to complete. Once the rolling restart appears to be done (I'm guessing here), I have to watch out for nodes that do not come back up.
After the restart, my cluster spends another couple of minutes cleaning up the state of the indexes and ensuring that the events are properly replicated.
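If you want to keep an eye on that cleanup, the check itself is simple to sketch: compare each index's copy counts against the cluster's RF and SF and list the ones that fall short. The snippet below is only an illustration. The `IndexStatus` record and the `find_unmet` helper are hypothetical names, and the sample numbers are made up; you would have to fill them in from whatever your cluster status view reports.

```python
from dataclasses import dataclass


@dataclass
class IndexStatus:
    # Hypothetical record; populate it from your own cluster status output.
    name: str
    replicated_copies: int
    searchable_copies: int


def find_unmet(statuses, replication_factor=2, search_factor=2):
    """Return {index_name: [reasons]} for indexes below RF or SF."""
    problems = {}
    for status in statuses:
        reasons = []
        if status.replicated_copies < replication_factor:
            reasons.append(
                f"replicated {status.replicated_copies}/{replication_factor}"
            )
        if status.searchable_copies < search_factor:
            reasons.append(
                f"searchable {status.searchable_copies}/{search_factor}"
            )
        if reasons:
            problems[status.name] = reasons
    return problems


if __name__ == "__main__":
    # Made-up sample: one healthy index, one with a single searchable copy.
    sample = [
        IndexStatus("main", 2, 2),
        IndexStatus("windows_events", 2, 1),
    ]
    for name, reasons in find_unmet(sample).items():
        print(name, "->", ", ".join(reasons))
```

Running the check a few times after the restart and watching the list shrink gives a rough sense of whether the cluster is still catching up or is actually stuck.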
I guess it really comes down to how much new data is coming into your cluster while you are applying the bundle. That data needs to be replicated, and with nodes going down during the restart, Splunk will get it replicated but it takes some time.
The only thing I recommend doing before applying a cluster bundle is checking the config to make sure there are no typos.
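For the typo check, a rough pre-flight lint along these lines might catch the obvious slips, such as an unclosed stanza header or a setting with no `=`. This is a minimal sketch, not how Splunk parses .conf files: `lint_conf` is a hypothetical helper, it only understands stanza headers, `key = value` lines, `#` comments and backslash continuations, and it cannot tell whether a setting name is valid. Splunk's own `splunk validate cluster-bundle` command should remain the real check.

```python
import re

# Simplified patterns; an assumption, not Splunk's actual parsing rules.
STANZA = re.compile(r"^\[[^\[\]]+\]$")
SETTING = re.compile(r"^[^=\s][^=]*=.*$")


def lint_conf(text):
    """Return a list of (line_number, message) for lines that look malformed."""
    problems = []
    continued = False  # True when the previous line ended with a backslash
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        was_continued = continued
        continued = line.endswith("\\")
        # Skip continuation lines, blank lines and comments.
        if was_continued or not line or line.startswith("#"):
            continue
        if line.startswith("["):
            if not STANZA.match(line):
                problems.append((number, "malformed stanza header"))
        elif not SETTING.match(line):
            problems.append((number, "expected key = value"))
    return problems


if __name__ == "__main__":
    # Made-up sample with two deliberate typos (lines 1 and 4).
    sample = "\n".join([
        "[indexes",
        "[main]",
        "homePath = $SPLUNK_DB/defaultdb/db",
        "repFactor auto",
    ])
    for number, message in lint_conf(sample):
        print(f"line {number}: {message}")
```

You could run it over each .conf file in the bundle directory before pushing and treat any output as a reason to stop and look.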