Deployment Architecture

Replication Factor with N+1 Indexers

anthonypradal
New Member

Hello,

I would like to understand how data is distributed in our current setup.

We have set the replication factor to 3 and deployed a cluster of 5 indexers. Where is the data stored? Is it like RAID 5, or something else?
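
For context, this is roughly what we set in server.conf on the cluster master (the secret is a placeholder):

    # server.conf on the cluster master
    [clustering]
    mode = master
    replication_factor = 3
    pass4SymmKey = <cluster_secret>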

Could I lose 2 servers and still guarantee our data integrity?

Thank you

sudosplunk
Motivator

The data is stored across your cluster randomly.

Let me start with some basic definitions:

Source node: The source node ingests data from forwarders or other external sources.
Target node: The target node receives streams of replicated data from the source nodes.

You cannot currently specify which nodes will receive the replicated data. The master determines that on a bucket-by-bucket basis, and the behavior is not configurable. You must assume that any of the peer nodes can serve as targets.

At any given time, each source peer would be streaming copies of its data to two target peers, but each time it started a new hot bucket, its set of target peers could potentially change.
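
As an illustration, note that a peer's configuration never names its replication targets: each peer just opens a replication port and registers with the master, which then assigns targets per bucket. A minimal peer-side server.conf sketch (hostname, port, and secret are placeholders):

    # server.conf on each peer node (indexer)
    [replication_port://9887]

    [clustering]
    mode = slave
    master_uri = https://cluster-master.example.com:8089
    pass4SymmKey = <cluster_secret>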

"Could I lose 2 servers and still guarantee our data integrity?"

The short answer is yes.

The cluster can tolerate a failure of (replication factor - 1) peer nodes. For example, a replication factor of 3 means that the cluster stores three identical copies of each bucket on separate nodes. With a replication factor of 3, you can be certain that all your data will be available if no more than two peer nodes in the cluster fail. With two nodes down, you still have one complete copy of data available on the remaining peers.
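
If you want to verify this on your own cluster, you can run the following on the master to see whether the replication and search factors are currently met (output varies by version, so I won't reproduce it here):

    splunk show cluster-status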

Bonus:

With a search factor of at least 2, the cluster is able to continue searching with little interruption if a peer node goes down. For example, say you specify a replication factor of 3 and a search factor of 2. The cluster will maintain three copies of all buckets on separate peers across the cluster, and two copies of each bucket will be searchable. Then, if a peer goes down and it contains a bucket copy that has been participating in searches, a searchable copy of that bucket on another peer can immediately step in and start participating in searches.

On the other hand, if the cluster's search factor is only 1 and a peer goes down, there will be a significant lag before searching can resume across the full set of cluster data.
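
For completeness, the search factor is set on the master alongside the replication factor, either in the [clustering] stanza of server.conf or via the CLI. Using the values from your scenario (you may need to restart the master for the change to take effect):

    splunk edit cluster-config -replication_factor 3 -search_factor 2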

Hope this helps!

You can find more information in the Splunk docs on basic indexer cluster architecture.
