Multi-Site Cluster | Failure Tolerance

Loves-to-Learn Lots

Let's say I have 4 indexers at one site, 'AB', and 4 indexers at another site, 'CD' (the DR site).
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2

Question 1: I understand from this document that if 3 of my indexers go down at site 'AB', my 4th indexer will keep ingesting data and will keep copies in a reserve state, to be distributed when the other indexers come back. Please confirm.

Question 2: What if all 4 of my indexers go down at site 'AB'? How would ingestion be managed then? Would the cluster master automatically redirect data ingestion to the indexers at DR site 'CD'?

Question 3: My site replication factor is origin:2,total:3. Say two indexers at site 'AB', both holding a copy of the same bucket, go down, so that both copies of that bucket at site 'AB' become unavailable. Would the cluster master then have a copy fetched from DR site 'CD' and replicated to the 2 running indexers at site 'AB'?

Re: Multi-Site Cluster | Failure Tolerance

SplunkTrust

Answer 1: Confirmed, sort of. There's no such thing as a "reserve state"; buckets simply won't be replicated until another AB indexer comes online.

Answer 2: It depends on how you've set up the outputs.conf files in your environment. If they list all indexers or use indexer discovery, the sending systems will send their data to a surviving indexer. If some servers are configured to send only to site AB, they will buffer data until an AB indexer is available.

Answer 3: Yes, the CM will try to restore the replication and search factors by copying data from the CD site to surviving indexers in the AB site.
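
To illustrate the first option in Answer 2, here is a minimal sketch of a forwarder's outputs.conf that lists indexers from both sites. The hostnames, the port 9997, and the group name all_indexers are placeholders assumed for illustration; none of them come from this thread.

```ini
# outputs.conf on a forwarder (sketch; hostnames, port and group name are assumed)
[tcpout]
defaultGroup = all_indexers

[tcpout:all_indexers]
# Indexers from both sites are listed, so the forwarder can keep sending
# to site CD when no indexer at site AB is reachable.
server = idx1-ab.example.com:9997, idx2-ab.example.com:9997, idx1-cd.example.com:9997, idx2-cd.example.com:9997
```

With a list like this, the forwarder load-balances across whichever listed indexers are reachable. A forwarder whose server list names only AB indexers would instead buffer data during an AB outage, as described in Answer 2.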

---
If this reply helps you, an upvote would be appreciated.

Re: Multi-Site Cluster | Failure Tolerance

Loves-to-Learn Lots

Quoting Answer 2: "It depends on how you've set up your outputs.conf files in your environment. If they contain all indexers or use Indexer Discovery then the sending systems will send their data to a surviving indexer. If there are servers configured to send only to site AB then they will buffer data until an AB indexer is available."

What do I need to look at to learn more about this behavior? Is this something the cluster master controls?

Please note that we have an active-active configuration, where both sites receive data from different clients and also act as DR for each other.

Re: Multi-Site Cluster | Failure Tolerance

SplunkTrust

Look at the outputs.conf file(s) in your deployment server's deployment-apps directory. They may also be in your configuration management tool (Ansible, Puppet, etc.).

Active/active is normal multi-site cluster behavior.

Re: Multi-Site Cluster | Failure Tolerance

Loves-to-Learn Lots

Active-active in the sense that both sites will carry licensing costs for the clients they ingest data from, rather than one site acting only as DR for the other.
To be honest, none of the documents I have gone through on the official Splunk website explains the particular case where a single CM handles two different active sites that fulfill HA and DR requirements while also ingesting their own data. Could you please point me to online documentation that explains it, with all the needed settings for the .conf files?

Re: Multi-Site Cluster | Failure Tolerance

Loves-to-Learn Lots

Also, the Ansible playbook that converts the two sites into a multi-site active-active configuration seems to set only the following parameters for server.conf:
'constrain_singlesite_buckets' -- 'false'
'multisite' -- 'true'
'available_sites' -- 'site1,site2'
'site_replication_factor' -- 'origin:2,total:3'
'site_search_factor' -- 'origin:1,total:2'
'replication_factor' -- '1'
'search_factor' -- '1'

There seems to be nothing specific in outputs.conf other than some parameters for forwarding the CM's data to the indexers:
forwardedindex.filter.disable = true
indexAndForward = false
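
For reference, the server.conf parameters listed above would look roughly like this on the cluster master. This is a sketch, not the playbook's actual output: the [general] site assignment and the mode line are assumptions added so the stanza is complete, and only the remaining values come from the list.

```ini
# server.conf on the cluster master (sketch)
[general]
# Assumption: the master itself is assigned to site1
site = site1

[clustering]
# Assumption: older releases use "master" here; newer ones also accept "manager"
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2
constrain_singlesite_buckets = false
replication_factor = 1
search_factor = 1
```

Note that none of these settings tell forwarders where to send data; that is decided by each forwarder's outputs.conf.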

Re: Multi-Site Cluster | Failure Tolerance

SplunkTrust

Your outputs.conf file must have either a server setting or an indexerDiscovery setting.
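
As a rough sketch of the indexerDiscovery variant, a forwarder's outputs.conf could look like the following. The names cluster1 and cluster1_indexers, the manager URI, and the key value are placeholders assumed for illustration. Newer Splunk releases also accept manager_uri in place of master_uri, so check the outputs.conf.spec for your version.

```ini
# outputs.conf on a forwarder (sketch; names, URI and key are assumed)
[indexer_discovery:cluster1]
# Must match the pass4SymmKey set in the [indexer_discovery] stanza
# of server.conf on the cluster master.
pass4SymmKey = <shared_key>
master_uri = https://cm.example.com:8089

[tcpout:cluster1_indexers]
# The forwarder asks the cluster master for the current list of peer
# indexers instead of using a static "server =" list.
indexerDiscovery = cluster1

[tcpout]
defaultGroup = cluster1_indexers
```

With this approach the forwarder gets its list of indexers from the cluster master, so indexers can be added or removed without editing each forwarder's outputs.conf.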

Re: Multi-Site Cluster | Failure Tolerance

Loves-to-Learn Lots

outputs.conf for the CM, right?
Could you point me to some online documentation that explains it, and how these parameters control this mechanism in an active-active cluster? That would be really helpful.

Re: Multi-Site Cluster | Failure Tolerance

SplunkTrust

Not outputs.conf for the CM, but for everything else (except indexers). The file is documented in the Admin Manual and in $SPLUNK_HOME/etc/system/README/outputs.conf.spec.

Re: Multi-Site Cluster | Failure Tolerance

Loves-to-Learn Lots

I understand that this would restrict the SH to searching site-specific indexers.

But what about restricting data ingestion to a specific site?
We are using HEC.
