We're about to deploy a new multi-site Splunk infrastructure.
It consists of 4 SHs, 40 IDXs, 10 HFWs and 2 cluster masters (eventually also license masters) in an active-passive pair, split across 2 sites. The replication factor and the search factor are both 3.
Our idea is to offer our customer automated procedures to recover from every failure scenario. We are paying particular attention to the cluster masters: we would like to avoid a single point of failure (SPOF) in the cluster master role.
We thought of 2 possibilities for an active-passive configuration:
Which one would be best in a multi-site environment?
What flaws or possible issues could we run into?
I'm planning to build a master HA scenario using the Linux HA daemon and rsync of NFS between two nodes.
If the network topology between your two sites allows VLAN extension, that will work, but there is no need
for DNS to point to both nodes, since all indexers must be attached to the same master node at the same time.
You can have the IP that all peers and search heads connect to managed by the HA daemon as an alias;
it only needs to be up before splunkd is started on the active master node.
That should require no change on the peers or search heads.
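
To make the ordering concrete, here is a minimal Python sketch of the takeover sequence on the node that becomes active: bring the alias IP up first, then start splunkd. It is only an illustration; in a real setup the HA daemon itself would normally enforce this ordering. The address 192.0.2.10/24, the interface eth0, the install path /opt/splunk and the function names are all placeholders I made up.

```python
import subprocess
from typing import Callable, List, Sequence


def takeover_steps(vip_cidr: str, iface: str, splunk_home: str) -> List[List[str]]:
    """Return the commands to run, in order, on the node taking over.

    All three arguments are site-specific; the values used in the
    example below are hypothetical.
    """
    return [
        # 1. Bring up the alias IP that peers and search heads connect to.
        ["ip", "addr", "add", vip_cidr, "dev", iface],
        # 2. Only then start splunkd on the (now active) master node.
        [f"{splunk_home}/bin/splunk", "start"],
    ]


def run_takeover(
    steps: Sequence[Sequence[str]],
    runner: Callable[..., object] = subprocess.run,
) -> None:
    """Run each step in order; with the default runner a failing step
    raises CalledProcessError, so splunkd is never started without the alias."""
    for argv in steps:
        runner(list(argv), check=True)


# Example (needs root on the node, so it is left commented out):
# run_takeover(takeover_steps("192.0.2.10/24", "eth0", "/opt/splunk"))
```

Since the peers and search heads only ever see the alias IP, nothing in their configuration has to change when the active node switches.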
Curious to hear if you ever got around to implementing your Linux HA daemon/rsync idea for cluster masters in a multi-site topology. So far I haven't found any alternatives, and I'd like to hear how this worked out for you over there. Thanks.