Scenario: a multi-site cluster with two sites, site1 and site2:
site_replication_factor = origin:2,total:3
site_search_factor = origin:2,total:3
bucket12345 has 2 copies in site1 (origin) and 1 copy in site2.
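For context, the scenario above corresponds to a cluster master configuration along these lines (an illustrative sketch; the exact stanza depends on your deployment):

```ini
# server.conf on the cluster master (illustrative sketch)
[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,total:3
site_search_factor = origin:2,total:3
```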
When a copy of the bucket is deleted in the origin site (the rb_* copy on site1), the cluster master (CM) kicks off a fixup job to make a new copy of that bucket. I see it being copied from an indexer in site2 instead of an indexer in site1. I expected Splunk to use a copy in the same site as the source, but it isn't doing that.
Why?
The logic behind bucket replication sourcing works like this:
1) We will prefer a local site source for RF replication.
2) However, if every local source is already at its maximum capacity for how many replications it can be involved in (max_peer_rep_load), then we will go cross-site for RF replication.
3) For SF replication, there is no preference; source selection ends up being random.
On the CM in server.conf: [clustering] max_peer_rep_load can be used to throttle up/down how many replication jobs are happening at once. Lowering this will slow down non-streaming (warm/cold) bucket replication, but will not affect streaming (hot) bucket replication. This value represents "slots" for each indexer to participate in non-streaming replication, either as a source or as a target.
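To adjust the concurrency, the setting goes under the [clustering] stanza in the CM's server.conf, for example (5 is the default in the server.conf spec):

```ini
# server.conf on the cluster master
[clustering]
# Slots per peer for non-streaming (warm/cold) bucket replication,
# counting both source and target roles on each indexer.
max_peer_rep_load = 5
```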
Huh, what? Need an example.
Imagine 3 peers on site1 with buckets A and B that we want replicated intra-site (site1 needs 2 copies of both A and B), max_peer_rep_load=1 (for a simplified example), and 1 peer on site2:
Site1:
Peer1 - Bucket A
Peer2 - Bucket B
Peer3 - Bucket C
Site2:
Peer4 - Bucket B
We may trigger replication of Bucket A as Peer1 -> Peer2. Since Peer1 and Peer2 are now involved in a replication, the single "peer rep" slot on each is taken.
Peer3 still has a slot available, so it can receive a replication of Bucket B, but since Peer2 (the local holder of B) has no slot free, the source has to come from an outside site (Peer4 in site2), triggering an inter-site copy.
Unfortunately, when we fix buckets, we fix them in a fixed (but effectively arbitrary) order, and if the bucket we're scheduling next for replication doesn't have an available source on the local site, it will be sourced from an alternate site.
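The walkthrough above can be sketched as a toy model (this is not Splunk source code, just an illustration of the slot-based sourcing described here): prefer a same-site source, but fall back to another site when every local holder's rep slot is full.

```python
# Toy model of slot-based replication source selection (not Splunk code).
MAX_PEER_REP_LOAD = 1  # slots per peer, as in the simplified example

class Peer:
    def __init__(self, name, site, buckets):
        self.name, self.site, self.buckets = name, site, set(buckets)
        self.load = 0  # replication jobs this peer is currently involved in

    def has_slot(self):
        return self.load < MAX_PEER_REP_LOAD

def pick_source(peers, bucket, target):
    """Pick a source holding `bucket`, preferring the target's own site."""
    candidates = [p for p in peers
                  if bucket in p.buckets and p is not target and p.has_slot()]
    local = [p for p in candidates if p.site == target.site]
    # Prefer a same-site source; otherwise go cross-site; else give up.
    return (local or candidates or [None])[0]

peers = [
    Peer("Peer1", "site1", {"A"}),
    Peer("Peer2", "site1", {"B"}),
    Peer("Peer3", "site1", {"C"}),
    Peer("Peer4", "site2", {"B"}),
]
p1, p2, p3, p4 = peers

# Replicate bucket A: Peer1 -> Peer2 consumes both peers' only slot.
src = pick_source(peers, "A", p2)
src.load += 1; p2.load += 1
print(f"A: {src.name} -> Peer2")   # A: Peer1 -> Peer2

# Replicate bucket B to Peer3: Peer2 holds B but has no free slot,
# so the only eligible source is Peer4 on site2 (an inter-site copy).
src = pick_source(peers, "B", p3)
print(f"B: {src.name} -> Peer3")   # B: Peer4 -> Peer3
```

Raising max_peer_rep_load in this toy model would leave a slot free on Peer2, letting the second replication stay intra-site.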
A huge thank you to @dxu_splunk for the background to answer the question.
-dave