How to force local indexing in each site of a multisite indexer cluster?

thomas_forbes
Communicator

I have three geographically separated sites where I am implementing a multisite Splunk indexer cluster. The main site will have one search head, two clustered indexers, one master node, and one deployment server. Each of the other two sites will have one search head and two clustered indexers. My main concern is ensuring that the indexers at each site index only the data produced in the geographic area where they are located.

jkat54
SplunkTrust

http://docs.splunk.com/Documentation/Splunk/6.2.0/Indexer/Sitereplicationfactor

Make your origin = your total like this:

site_replication_factor = origin:2,total:2
site_search_factor = origin:2,total:2

dxu_splunk
Splunk Employee

This configuration makes buckets replicate only within the site where they were indexed. So a forwarder sending data into site1 will not have its data replicated outside of site1. (This seems to be what you wanted; I just wanted to expand a little on the setting.)

thomas_forbes
Communicator

I am aware of this. I have created a separate server class to hold all of my site-specific apps and conf files, and one of those files (outputs.conf) lists the indexers that belong to the remote site.
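
For illustration, a site-specific outputs.conf for forwarders at a remote site might look like the sketch below. The group name, host names, and receiving port 9997 are placeholders assumed for this example, not values from my environment.

```ini
# outputs.conf deployed to forwarders at the remote site (hypothetical names)
[tcpout]
defaultGroup = site2_indexers

[tcpout:site2_indexers]
# List only the indexers located at this forwarder's own site
server = idx1.site2.example.com:9997,idx2.site2.example.com:9997
```

Because these forwarders send only to local indexers, the local site becomes the origin site for their data.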

thomas_forbes
Communicator

So I am pretty sure this all worked. I made the changes in the master's server.conf file and was able to bring all Splunk services back up. The only problem I am faced with now is that the remote indexer cluster is replicating over the wire. It is making painfully slow progress that will eventually lead to full replication at all sites.

I have one question. I have pushed Splunk to a few clients at my remote site via the deployment server that is located at my main site. Should those new clients be sending data to my remote search head, or does the replication have to complete first?

Thanks for the assistance.

Tom Forbes

jkat54
SplunkTrust

Hey, those were just examples. If you read the docs carefully, you should be able to get rid of replication across the WAN altogether.

Forwarders should always send data to indexers, and deployment servers should only be used to manage forwarders, so "new clients send data to my remote search head" doesn't make sense to me.

thomas_forbes
Communicator

new clients = forwarders.

I have multiple sites that are separated geographically. So when I refer to a remote search head or remote indexer cluster I mean the server instance that is not part of my main site that hosts (in addition to a search head and indexer cluster) a master node and deployment server.

jkat54
SplunkTrust

Depends on what you put in the outputs.conf on the remote forwarders. If you're having them send to indexers in your main site, search heads in your main site should see the data as soon as it arrives. Search heads in the remote site wouldn't see the data at all because you wanted to keep data in the origin site only and in this case the origin will be the main site.

jkat54
SplunkTrust

The origin is the site where the data is first indexed, if that helps.
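
Put another way, the origin site is set by the site attribute of the peer that first receives the data. A rough sketch of a peer's server.conf at a second site follows. The site name, master host name, replication port, and key are all placeholders I am assuming for the example.

```ini
# server.conf on an indexer (peer) at the second site; all values are placeholders
[general]
site = site2

[replication_port://9887]

[clustering]
mode = slave
master_uri = https://master.example.com:8089
pass4SymmKey = <your_cluster_key>
```

Any data a forwarder sends to this peer would have site2 as its origin.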

thomas_forbes
Communicator

Do I make this change to my master only?

jkat54
SplunkTrust

The answer to your question is in the link, under "Syntax":

Configure the site replication factor with the site_replication_factor attribute in the master's server.conf file. The attribute resides in the [clustering] stanza, in place of the single-site replication_factor attribute. For example:

[clustering]
mode = master
multisite=true
available_sites=site1,site2
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2
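
That docs example covers two sites and allows one copy to live off the origin site. Adapted to the three-site setup in this thread, with origin equal to total so that no copies leave the origin site, the master's server.conf might look roughly like this. The site names site1 through site3 are assumed.

```ini
# server.conf on the master node (site names are assumptions)
[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2,site3
# origin equals total, so every copy stays on the site that indexed the data
site_replication_factor = origin:2,total:2
site_search_factor = origin:2,total:2
```

With two indexers per site, origin:2 can be satisfied locally at every site.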

thomas_forbes
Communicator

What about the "available_sites" value?

jkat54
SplunkTrust

available_sites MUST list all of the site names in the multisite cluster, as per this: http://docs.splunk.com/Documentation/Splunk/6.2.0/Indexer/Multisiteconffile

To find that I googled "available_sites Splunk".
