Multi-site cluster configuration help with 2 cluster peers?

agodoy
Communicator

Hello Everyone,
I am having trouble understanding the site_replication_factor and site_search_factor options.

I have two sites with one search peer each. I want the data that one site receives to be replicated to the other site.

I only have one search head in the cluster and the cluster master.

I am using the following:

[clustering]
mode = master
replication_factor=1
search_factor=1
multisite=true
available_sites=site1,site2
site_replication_factor = origin:1,total:1
site_search_factor = origin:1,total:1

I also tried the same but with the values below:

[clustering]
mode = master
replication_factor=2
search_factor=2
multisite=true
available_sites=site1,site2
site_replication_factor = origin:1,total:2
site_search_factor = origin:1,total:2

In the second scenario, the cluster never met its search and replication factors.

Currently, I am running the cluster with the first configuration listed. When I take a search peer down, the data that was on it has not been replicated to the other peer.

Any ideas on what the proper values for site_replication|search_factor should be to get the full set of data even when a peer is down?

1 Solution

frmaasdam
Path Finder

You cannot define a multisite cluster with only 2 cluster peers. You need at least 3.
When you have 2 cluster peers, one in each site, you effectively have a single-site cluster, even though the second cluster peer sits at the other site. Confused? 😉
Set your site name, then:
replication_factor = 2
search_factor = 1 or 2, depending on whether the replicated data must be searchable immediately at the replica site.
If not, the replicated copy is only made searchable after one of your cluster peers goes away, and that takes time, possibly a lot of time.
With search_factor = 2, your data is replicated and searchable at the same time.
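
As a minimal sketch of the single-site settings described above, the [clustering] stanza in server.conf on the cluster master might look like the following. It assumes multisite is simply left disabled and that you pick search_factor = 2 so the second copy is searchable right away; use 1 if a delay after a peer failure is acceptable.

```ini
# server.conf on the cluster master (sketch, single-site setup)
[clustering]
mode = master
# one copy on each of the two peers
replication_factor = 2
# 2 = both copies searchable immediately; 1 = only one searchable copy
search_factor = 2
```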

When you have 3 cluster peers you can define a multisite cluster,
with 2 cluster peers at one site and 1 at the other.
Now you can define your multisite replication factors, that is, where you want the copies to reside.
multisite = true
site = site1 on two peers and site = site2 on the other peer.
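
To illustrate the site assignment, here is a sketch of what each peer's server.conf could contain, assuming the site is set through the site attribute in the [general] stanza. The rest of the peer's clustering configuration is omitted.

```ini
# server.conf on the two peers located at the first site
[general]
site = site1

# server.conf on the single peer located at the second site would use:
# [general]
# site = site2
```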
Now, if you set your replication factor to 2, you do not want the copy to land on the other peer in site1 when the original data is already on a site1 peer.
In that case there would be no replicated data at the other site, site2.
This is where the multisite replication factor comes in, taking the place of the original replication_factor!
available_sites = site1,site2
If you name all sites explicitly, the total must equal the sum of the explicitly named sites.
site_replication_factor = origin:1,site1:1,site2:1,total:2
Because the origin site holds 1 copy, the replicated copy is forced to the other site:
If site1 cluster peer 1 is the origin, a replicated copy goes to the cluster peer in site2.
If site1 cluster peer 2 is the origin, a replicated copy goes to the cluster peer in site2.
If the cluster peer in site2 is the origin, a replicated copy goes to 1 of the 2 cluster peers in site1.
So the origin site always has 1 copy, site1 always has 1 copy, and site2 always has 1 copy: 2 copies in total.

Then you can set your site_search_factor:
site_search_factor = origin:1,total:1 or origin:1,total:2, depending, as said earlier, on whether the replicated copy must be searchable immediately.
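
Putting the pieces together, a sketch of the master's server.conf for the three-peer layout could look like this. Two things are assumptions rather than part of the answer: the master is placed in site1, and total:2 is chosen for the search factor so the copy at the other site is searchable immediately.

```ini
# server.conf on the cluster master (sketch, 2 peers in site1, 1 peer in site2)
[general]
# assumption: the master itself sits at site1
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
# 1 copy at each site, 2 copies in total
site_replication_factor = origin:1,site1:1,site2:1,total:2
# use origin:1,total:1 instead if the remote copy need not be searchable at once
site_search_factor = origin:1,total:2
```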

Frank Maasdam

dxu_splunk
Splunk Employee

using the settings of your first example, you can do:

site_replication_factor = origin:1,total:2
site_search_factor = origin:1,total:2

with replication_factor=1, search_factor=1.

However, this is identical to non-multisite clustering. The only advantage is that it's easier to move from 1 peer per site to multiple peers per site later than to reconfigure.
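
For reference, here is the first configuration from the question with only the two site factors changed as suggested above. This is a sketch of the master's [clustering] stanza only; each node would also need its own site set in server.conf.

```ini
# server.conf on the cluster master (sketch, one peer per site)
[clustering]
mode = master
replication_factor = 1
search_factor = 1
multisite = true
available_sites = site1,site2
# origin keeps 1 copy, the second copy goes to the other site
site_replication_factor = origin:1,total:2
site_search_factor = origin:1,total:2
```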

agodoy
Communicator

I guess single site cluster it is. Thanks for the detailed answer.
