Distributed Deployment: Splunk data replication

MHibbin
Influencer

Hi,

I have a question about a distributed deployment.

Suppose a deployment is set up with, for example:

  • 2 x Indexers
  • n x Forwarders (set up to autoLB between the indexers)
  • 1 x Search Head

The autoLB setting will forward data to the Splunk indexers in a cycle based on time. What happens to the visibility of data if one of the indexers becomes inactive (e.g. through a system failure)? I would imagine that Splunk would be able to see roughly half of the data. Is this assumption correct?

How would data replication between the indexers take place? If there is a requirement for the data to remain 100% visible, what would be the best way to achieve this?

I'm sure I have come across guidelines on data replication between two indexers in past notes, discussions, or Splunk documentation, but I am not able to find the justification I require.

Are there any thoughts on documentation or sources of information that would be useful?

Any thoughts welcome, thanks in advance.

Regards,

MHibbin

1 Solution

gkanapathy
Splunk Employee

You are correct that if one indexer is down, only about half the data will be visible, though the search head will report that it is unable to reach all indexers.

In the current version, there is no native replication of data. You will have to do this either by replicating at the underlying storage layer, or by forwarding from each indexer to a replica instance. Each approach has disadvantages relative to the other. In addition, there is no built-in mechanism for failover, so you would have to implement this yourself. Neither solution is simple to implement correctly and robustly. An overview of this is here: http://docs.splunk.com/Documentation/Splunk/4.3.2/Installation/Highavailabilityreferencearchitecture

In future versions, you may expect some form of built-in replication, as well as a more automated built-in failover, that should be preferable to these other methods.
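
To make the "about half the data" estimate concrete, here is a rough simulation sketch. It is not Splunk code: the function name, the parameters, the steady one-event-per-second rate, and the choice of a random indexer per load-balancing interval are all assumptions made for illustration.

```python
import random


def simulate_data_split(num_forwarders=20, num_indexers=2, duration_s=3600,
                        switch_interval_s=30, seed=0):
    """Return the fraction of all events that ends up on each indexer.

    Hypothetical model: every forwarder emits one event per second and
    picks a (random) target indexer at the start of each interval.
    """
    rng = random.Random(seed)
    counts = [0] * num_indexers
    for _ in range(num_forwarders):
        for _window_start in range(0, duration_s, switch_interval_s):
            # Assumption: the target is chosen at random per interval.
            target = rng.randrange(num_indexers)
            # All events from this interval land on the chosen indexer.
            counts[target] += switch_interval_s
    total = sum(counts)
    return [count / total for count in counts]


fractions = simulate_data_split()
# If the first indexer goes down, only the data held elsewhere is searchable.
visible_if_first_down = 1.0 - fractions[0]
print(f"Share per indexer: {fractions}")
print(f"Visible with indexer 1 down: {visible_if_first_down:.1%}")
```

Under these assumptions the split comes out close to even, so losing one of two indexers hides roughly half of the historical data. A strict alternation between indexers would give the same long-run result; uneven event rates across forwarders or over time would skew it.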
