We are deploying a Splunk high-availability cluster in AWS with one master node, one search head, and three peer nodes. I would like to know how to provide HA for the master node and the search head. Is an AMI backup of the running master the best option? Please share your views.
Here is a .conf2015 talk that my colleagues and I gave on deploying a highly available Splunk Enterprise architecture on AWS. We talk about how to leverage autoscaling for the master node, since it is a stateless server, as mentioned in the above answer.
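To make the autoscaling idea concrete, here is a rough sketch (not taken from the talk) of a one-instance Auto Scaling group: with min, max, and desired capacity all set to 1, AWS replaces the master if its instance fails. It assumes the master's configuration is already baked into an AMI or launch template, and that peers and the search head reach the master through a stable address such as a DNS name. The group name, launch template name, and subnet IDs are made-up placeholders.

```python
def master_asg_params(name, launch_template_name, subnet_ids):
    """Build keyword arguments for a one-instance Auto Scaling group.

    All names passed in are hypothetical placeholders; substitute your own.
    """
    if not subnet_ids:
        raise ValueError("at least one subnet ID is required")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        # Exactly one master at all times: a failed instance is replaced,
        # and the group never scales out to a second master.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # Listing subnets in several Availability Zones lets the
        # replacement instance launch in a different zone if needed.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": 300,  # seconds; an assumed value
    }


def create_master_asg(params):
    """Create the group in AWS (needs boto3 and valid AWS credentials)."""
    import boto3  # imported here so building the parameters needs no AWS SDK

    boto3.client("autoscaling").create_auto_scaling_group(**params)


if __name__ == "__main__":
    params = master_asg_params(
        "splunk-master-asg",       # hypothetical group name
        "splunk-master-template",  # hypothetical launch template name
        ["subnet-aaaa1111", "subnet-bbbb2222"],  # hypothetical subnet IDs
    )
    print(params)
    # create_master_asg(params)  # uncomment to actually create the group
```

Building the parameters separately from the AWS call makes it easy to review them before anything is created. How the replacement master picks up its configuration and how the peers find it again are left out here and depend on your setup.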
For the master node: since it neither stores data nor runs searches, the cluster can continue to run as usual if the master goes down, as long as there are no other failures. Peers can continue to ingest data, stream copies to other peers, replicate buckets, and respond to search requests from the search head. An active-passive or standby setup is sufficient for the master.