Let's say I have 4 indexers at one site, 'AB', and 4 indexers at another site, 'CD' (the DR site).
site_replication_factor=origin:2,total:3
site_search_factor=origin:1,total:2
Question 1: I understand from this document that if 3 of my indexers go down at site 'AB', my 4th indexer will keep ingesting data and will keep copies in a reserve state, to be distributed when the other indexers come back. Please confirm.
Question 2: What if all 4 of my indexers go down at site 'AB'? How would ingestion be managed then? Would the cluster master automatically redirect data ingestion to the indexers at DR site 'CD'?
Question 3: I have a site_replication_factor of origin:2,total:3. Let's say the two indexers at site 'AB' that hold the copies of the same bucket both go down, so both copies of that bucket become unavailable at site 'AB'. Would the cluster master then arrange for a copy to be taken from DR site 'CD' and replicated to the 2 indexers still running at site 'AB'?
Answer 1: Confirmed. Sort of. There's no such thing as a "reserve state". Buckets simply won't be replicated until another AB indexer comes online.
Answer 2: It depends on how you've set up your outputs.conf files in your environment. If they contain all indexers or use Indexer Discovery then the sending systems will send their data to a surviving indexer. If there are servers configured to send only to site AB then they will buffer data until an AB indexer is available.
Answer 3: Yes, the CM will try to restore the replication and search factors by copying data from the CD site to surviving indexers in the AB site.
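To illustrate Answer 2, here is a minimal sketch of an outputs.conf on a forwarder that lists indexers from both sites, so that sending can continue to CD if every AB indexer is down. The host names and the group name are hypothetical, and port 9997 is only assumed as the receiving port; substitute whatever your environment uses.

```ini
# outputs.conf on a forwarder (sketch; host names and group name are hypothetical)
[tcpout]
defaultGroup = all_indexers

[tcpout:all_indexers]
# Indexers from both sites are listed, so the forwarder can still
# reach a surviving indexer at CD when all AB indexers are down.
# 9997 is an assumed receiving port.
server = ab-idx1.example.com:9997, ab-idx2.example.com:9997, cd-idx1.example.com:9997, cd-idx2.example.com:9997
```

A forwarder whose server list contains only AB indexers would instead buffer its data until an AB indexer comes back.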
/Answer 2: It depends on how you've set up your outputs.conf files in your environment. If they contain all indexers or use Indexer Discovery then the sending systems will send their data to a surviving indexer. If there are servers configured to send only to site AB then they will buffer data until an AB indexer is available./
What do I need to look at to learn more about this behavior? Is this something the cluster master controls?
Please note that we have an active-active configuration, where both sites receive data from different clients and also act as DR for each other.
Look at the outputs.conf file(s) in your deployment server's deployment-apps directory. They may also be in your configuration management tool (Ansible, Puppet, etc.).
Active/active is normal multi-site cluster behavior.
Active-active in the sense that both sites carry licensing cost for the clients they ingest data for; neither site is acting only as DR for the other.
To be honest, none of the documents I have gone through on the official Splunk website explain the case where a single CM handles two different active sites that fulfill HA and DR requirements while also ingesting their own data. Could you please point me to an online doc that explains this, with all the needed settings for the .conf files?
Also, the Ansible playbook that converts the two sites into a multi-site active-active configuration seems to set only the following parameters in server.conf:
constrain_singlesite_buckets -- false
multisite -- true
available_sites -- site1,site2
site_replication_factor -- origin:2,total:3
site_search_factor -- origin:1,total:2
replication_factor -- 1
search_factor -- 1
There seems to be nothing specific under outputs.conf other than some parameters for forwarding CM data to the indexers:
forwardedindex.filter.disable = true
indexAndForward = false
Your outputs.conf file must have a server setting or an indexerDiscovery setting.
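As a rough sketch of the indexerDiscovery variant, the forwarder's outputs.conf points at the cluster master and lets it supply the list of indexers. The stanza names, the URI, and the key placeholder below are hypothetical; newer Splunk versions also accept manager_uri in place of master_uri, so check the outputs.conf.spec that ships with your version.

```ini
# outputs.conf on a forwarder (sketch; names, URI and key are hypothetical)
[indexer_discovery:cluster1]
# Management URI of the cluster master (assumed host and port)
master_uri = https://cm.example.com:8089
# Must match the key set in the master's [indexer_discovery] stanza in server.conf
pass4SymmKey = <your_key>

[tcpout:discovered_indexers]
# Refers to the [indexer_discovery:cluster1] stanza above
indexerDiscovery = cluster1

[tcpout]
defaultGroup = discovered_indexers
```

With this approach no indexer host names are hard-coded on the forwarder. In a multisite cluster the forwarder is usually also assigned a site in its own server.conf, which affects which indexers it prefers, so that is worth checking as well.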
But that's what you suggested to look at under outputs.conf -
/Your outputs.conf file must have a server setting or a indexerDiscovery setting./
Here is what Splunk says about these parameters:
"server = [<ip>|<servername>]:<port>, [<ip>|<servername>]:<port>, ...
* A comma-separated list of one or more systems to send data to over a
  TCP socket.
* Required if the 'indexerDiscovery' setting is not set.
* Typically used to specify receiving Splunk systems, although you can use
  it to send data to non-Splunk systems (see the 'sendCookedData' setting).
* For each system you list, the following information is required:
  * The IP address or server name where one or more systems are listening.
  * The port on which the syslog server is listening.
indexerDiscovery = <name>
* The name of the master node to use for indexer discovery.
* Instructs the forwarder to fetch the list of indexers from the master node
  specified in the corresponding [indexer_discovery:<name>] stanza.
* No default."
My point was outputs.conf does not restrict the SH to search on any specific indexers. Nor does it restrict ingestion of data to any specific site.
outputs.conf for the CM, right?
Could you point me to an online doc that explains how these parameters control this mechanism in an active-active cluster? It would be really helpful.
outputs.conf is not for the CM, but for everything else (except indexers). The file is documented in the Admin manual and in $SPLUNK_HOME/etc/system/README/outputs.conf.spec.
I understand that this would restrict the SH to searching on some site-specific indexers.
But what about restricting ingestion of data to a specific site?
We are using HEC.
outputs.conf has nothing to do with running searches. Nor does it have anything to do with ingesting data. It merely tells a Splunk instance where to send its data.