Deployment Architecture

Is it possible to restrict index replication from specific sites?


Environment with:
- 6 sites
- 4 sites with peer nodes
- 2 sites, 'alpha' and 'bravo' with only search heads

Is it possible to restrict index replication from occurring at sites 'alpha' and 'bravo'?
-OR-
Will this restriction occur automatically, due to sites 'alpha' and 'bravo' not having any peer nodes?


Splunk Employee

That will be fine, as long as your site_replication_factor/site_search_factor allow for it. If you have something like origin:1,total:6, we'll try to put one copy in each site, which can never meet RF/SF because there are no peers at Alpha/Bravo. So origin:1,total:4 would be okay, as would origin:2,total:3.

Make sure to run ./splunk set indexing-ready on the cluster master every time it restarts; otherwise it will wait for peers from Alpha/Bravo to join before it starts scheduling cluster activities.
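
As a rough sketch of what this could look like in server.conf on the cluster master: the site IDs below are assumptions (Splunk expects site names of the form site<n>, so site5 and site6 stand in for 'alpha' and 'bravo'), and the factor values are simply the ones suggested above, not a recommendation for your environment.

```
# server.conf on the cluster master (illustrative sketch only)
[general]
# Assumption: the master itself is placed in site1
site = site1

[clustering]
mode = master
multisite = true
# Assumption: site1-site4 have peer nodes; site5/site6 ('alpha'/'bravo') have search heads only
available_sites = site1,site2,site3,site4,site5,site6
# total stays at or below the number of sites that actually have peers
site_replication_factor = origin:1,total:4
site_search_factor = origin:1,total:4
```

With these values the master never needs a copy at a site without peers. A total of 6 would not be satisfiable here, since only four sites can hold buckets.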



You don't even need to tell Splunk that sites Alpha and Bravo exist. You can have every server at one physical site or servers across 20 physical sites - it makes no difference. You get to define the sites, and assign servers to sites, using any scheme that you want in Splunk clustering.

In a multisite cluster, every indexer must have a site specified so that it can replicate properly. This has to be right, because it affects what you can set for site_replication_factor/site_search_factor (e.g. origin:1,total:4). There should be no "site" without indexers; in other words, every site should have at least one indexer assigned to it.

Every search head must have a site specified for the purpose of "search affinity." This allows the cluster master to direct the search head to the most appropriate indexers for searching. The specification of search head site has nothing to do with replication.

Since you don't actually have indexers at sites Alpha and Bravo, I would specifically:

  1. Configure your cluster master and indexers as a four-site cluster. Pretend that Alpha and Bravo do not exist.
  2. For the search heads at Alpha, identify the closest (best-performing) index site. Assign the Alpha search heads to that site. Now the Alpha search heads will search the nearest site; if it is not available, the cluster master will redirect the search heads to a surviving site.
  3. Repeat step 2 for the Bravo search heads.
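
The steps above might translate into server.conf roughly as follows. This is a sketch under assumptions: the site IDs, the choice of site2 as the "closest" site for the Alpha search heads, and the master hostname are all hypothetical placeholders, and newer Splunk versions may use different names for some of these settings.

```
# server.conf on the cluster master: four sites only, Alpha and Bravo are not declared
[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2,site3,site4
site_replication_factor = origin:1,total:4
site_search_factor = origin:1,total:4

# server.conf on a search head physically located at Alpha
[general]
# Assumption: site2 is the nearest / best-performing indexer site for Alpha
site = site2

[clustering]
mode = searchhead
multisite = true
# Hypothetical hostname; replace with your own cluster master
master_uri = https://cluster-master.example.com:8089
```

The Bravo search heads would get the same stanzas, with site set to whichever indexer site is closest to Bravo.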

This solution does not require set indexing-ready on the cluster master (although that's not a bad practice anyway).
It also allows you to pick the optimum site for search heads to search.
It is less of a hack and better aligned with the way indexer clustering works in Splunk.
