I have three geographically separated sites where I am implementing a multisite Splunk indexer cluster. The master site will have (1) search head, (2) clustered indexers, (1) master node, and (1) deployment server. Each of the other two sites will have (1) search head and (2) clustered indexers. My main issue is ensuring that the indexers at each site index only data produced in the geographic area where they are located.
available_sites MUST list all the site names in the multisite cluster as per this: http://docs.splunk.com/Documentation/Splunk/6.2.0/Indexer/Multisiteconffile
To find that I googled "available_sites Splunk".
The answer to your question is in the link under syntax:
Configure the site replication factor with the site_replication_factor attribute in the master's server.conf file. The attribute resides in the [clustering] stanza, in place of the single-site replication_factor attribute. For example:
mode = master
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2
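The origin value controls how many copies of each bucket stay in the site where the data was indexed, so data that a forwarder sends into site1 keeps two copies in site1. With total:3, one more copy still goes to another site; as far as I can tell from the docs, keeping buckets entirely within their originating site means setting total equal to origin. (It seems like this is what you wanted; I just wanted to expand a little more on this setting.)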
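If the goal is to keep every bucket copy inside the site where it was indexed, a master-side server.conf could look roughly like the sketch below. The site names site1, site2 and site3 are hypothetical, and the sketch assumes your Splunk version accepts a total equal to origin; check the multisite docs linked above before relying on it.

```ini
# server.conf on the master node (sketch; site names are assumptions)
[general]
site = site1

[clustering]
mode = master
multisite = true
# Must list every site in the cluster
available_sites = site1,site2,site3
# Two copies in the originating site, none elsewhere (total equals origin)
site_replication_factor = origin:2,total:2
# One searchable copy in the originating site, none elsewhere
site_search_factor = origin:1,total:1
```

The trade-off is that no site holds another site's data, so a search head that needs data from another site has to search that site's indexers across the WAN.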
So I am pretty sure this all worked. I made the changes in the Master server.conf file and was able to bring all Splunk services back up. The only problem I am faced with now is the remote index cluster replicating over the wire. It is making painfully slow progress that will eventually lead to full replication at all sites.
I have one question. I have pushed Splunk to a few clients at my remote site via the deployment server that is located at my main site. Should those new clients be sending data to my remote search head, or does the replication have to complete first?
Thanks for the assistance.
Hey those were just examples... if you carefully read the docs you should be able to get rid of replication across the WAN altogether.
Forwarders should always send data to indexers, and deployment servers should only be used to manage forwarders, so "new clients send data to my remote search head" doesn't make sense to me.
new clients = forwarders.
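To make sure data is indexed in the region where it is produced, each forwarder can be pointed only at the indexers in its own site. A minimal outputs.conf sketch for forwarders at a remote site follows; the group name, hostnames and port 9997 are assumptions for illustration, not values from this thread.

```ini
# outputs.conf on forwarders located at the remote site (sketch)
[tcpout]
defaultGroup = site2_indexers

[tcpout:site2_indexers]
# Only the two local indexers are listed, so data never leaves the site at index time
server = idx1.site2.example.com:9997,idx2.site2.example.com:9997
```

An app containing a file like this could be pushed from the deployment server, with a separate version for each site.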
I have multiple sites that are separated geographically. So when I refer to a remote search head or remote indexer cluster, I mean a server instance that is not part of my main site. The main site hosts a master node and deployment server in addition to a search head and indexer cluster.