Hello dear Splunk experts,
I would like to understand a "re-connection hiccup":
One of the indexers in my indexer cluster needed some downtime, so I used
~/bin/splunk offline --decommission_node_force_timeout 3600
to take it (kind of gracefully) offline. The cluster master showed "Restarting" (~/bin/splunk show cluster-status --verbose), so I started working on it and later rebooted the machine. Splunk started on the indexer and then... nothing. The cluster master still showed "Restarting" for over 10 minutes.
So I decided to log in to the web interface of the indexer and navigated to the settings / peer stuff. About 20 seconds later, the cluster-status output on the cluster master showed it as "status UP", without my having changed anything.
A few days later I did the same with another indexer and it was the same story: I needed to log in to the web interface before the peer was shown as UP in the cluster.
Is there anything in the docs that I have missed? Is this normal? (How can I trust the self-healing capabilities of the cluster if I need to manually log in to the peer after every downtime?)
Thanks a lot + kind regards, Triv