What is the best recommended configuration for site failures in a multisite indexer cluster?

lukasz92
Communicator

Hi,

I decided to build a 2-site cluster with two indexers on the first site and two indexers on the second site.
The search head on site1 is configured as site0, and the forwarders are not site-aware (although they use indexer discovery).
There is also a search head on site2.

The cluster master is on the second site.
The replication factor and search factor are both set to origin:2, total:3.

What solution do you recommend for site failures, such as the entire site2 going down (including its two indexers and the cluster master)?
I need to have access to all indexed data.

EDIT: I assume that during failure, all nodes in the second site operate correctly.

lguinn2
Legend

If site2 goes down, including the cluster master, the surviving search head(s) can still search site1, even if the cluster master is offline.
However, they will search using the "last known" information about peers and buckets, which may be out of date.

So I would do 2 things:

First, I would set the site1 search head to site1, not site0. And the site2 search head to site2, not site0. Why? Because I want the "last known" information to always be for the local site. That way, if the other site goes down, the search head will still be able to search. Using site0 means that the "last known" information could contain indexers/buckets from any site - not just the search head's local site.
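
As a rough sketch of this first point, the site assignment lives in the [general] stanza of each search head's server.conf. The host name, port, and key below are placeholders I made up for illustration, and newer Splunk releases use manager_uri in place of master_uri, so check the setting names against your version:

```ini
# server.conf on the site1 search head (sketch; values are placeholders)
[general]
# Explicit local site instead of site0
site = site1

[clustering]
mode = searchhead
multisite = true
# Hypothetical address of the cluster master
master_uri = https://cluster-master.example.com:8089
pass4SymmKey = <your cluster key>
```

The site2 search head would carry the same stanzas with site = site2.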

Second, I would have a backup cluster master available, on site1. If the cluster master goes down, I want to start the backup cluster master as soon as possible. This will keep the cluster up to date for both the peers and the search heads. This is particularly important for longer outages.
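
For the second point, the standby needs to carry the same cluster-wide settings as the active master. A minimal sketch of its server.conf, assuming the origin:2, total:3 factors from the question and a placeholder key, might look like this:

```ini
# server.conf on the standby cluster master at site1 (sketch, not a drop-in file)
[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
# Values taken from the question; they should match the active master
site_replication_factor = origin:2,total:3
site_search_factor = origin:2,total:3
pass4SymmKey = <your cluster key>
```

Peers and search heads find the master through master_uri, so the standby either has to be reachable under the same address (for example through a DNS alias) or master_uri has to be updated on every node when you fail over. Any configuration the active master distributes to the peers should also be kept in sync on the standby.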
