Splunk Cluster Master Server Fault Tolerance

micatcloudon
Engager

Greetings!
The fault tolerance for Splunk cluster nodes is clear, but how would a Splunk cluster recover from the loss of the master server?
What is the recovery procedure?


Drainy
Champion

Essentially, you can start up a new master node with the same IP/DNS details and it will assume the role of the original, as the other answer alludes to. The truth, however, is that the master itself has no HA or fault tolerance.

If you were to experience an event that took both a master and an indexer offline, you would be left in an inconsistent state, as per Starfleet's situation here: http://splunk-base.splunk.com/answers/65397/splunk-v5-clustering-and-ha

It's still a much improved HA solution, and it's pretty good to be fair, but there really needs to be better tolerance for the master failing 🙂

roychen
Path Finder

Hi,

You simply need to restore the master node, and the peers in the cluster will communicate with the master regarding the current state of the cluster (number of replicated copies / searchable copies, etc).

To make sure you can restore a master node ASAP, you might want to keep a standby master node, as per the instructions in:

http://docs.splunk.com/Documentation/Splunk/5.0.1/Indexer/Configurethemaster#Configure_a_stand-by_ma...
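If it helps, here is a rough sketch of how you might notice that the master is down before bringing up the standby. It is not part of Splunk; it only checks whether a TCP connection to the master's management port can be opened. The function name `master_reachable` and the hostname in the usage comment are made up for illustration, and 8089 is assumed to be the management port (the Splunk default; change it if yours differs).

```python
import socket


def master_reachable(host: str, port: int = 8089, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only proves that something is listening on the port. It does not
    verify that the cluster master is healthy or that the peers see it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Example usage (hypothetical hostname):
#   if not master_reachable("splunk-master.example.com"):
#       print("Master unreachable; consider starting the standby.")
```

A check like this only tells you when to act. The standby still has to come up under the same IP/DNS name that the peers are configured to use.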

Hope this helps.
