
Do we need to have RF for all the indexers in the cluster?

santosh11
New Member

Dear All,

We have a clustered environment with 7 search heads and 5 indexers. While reading about it, I ran into some doubts about how to configure these 5 indexers:
1) Do I need to set RF (replication factor) = 5? Replication is for data availability, but do we need all 5 systems to hold the data? Replicating the data to 5 different systems will impact performance, right?

2) Let's say we set RF=3 now, but going forward we have to use all 5 indexers and change it to RF=5. If we start with 3 and later move to 5, how will the data and buckets be impacted?

3) How do we use the search factor correctly?

Our application is huge. We currently ingest 50 GB of data per day, but going forward this will increase to 1 TB or more per day.

So I want to understand how to set this up properly.

Regards,
Santosh

1 Solution

harsmarvania57
Ultra Champion

Hi,

You do not need to set RF equal to the number of indexers. The replication factor is the number of copies of the data that you want the cluster to maintain. For example, if you set RF=2 (assuming 5 indexers in a single-site cluster), then when data arrives on Indexer1, Indexer1 stores one copy and replicates a second copy to one of the remaining four indexers.

Now assume that the second copy is stored on Indexer4 and Indexer4 goes down. The cluster master will then report that the replication factor is not met, and it will direct Indexer1 (which still holds one copy) to replicate that data to one of the remaining indexers so that RF is met again. If you set RF=5, you store 5 copies of the same data, which requires more storage. So it is up to you to decide which RF you need to cover your worst-case scenario.
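To make the RF=2 example above concrete, here is a minimal toy sketch in Python. It is not how Splunk actually chooses replication targets (the cluster master decides that internally); the function names `place_copies` and `fix_up` and the random peer selection are assumptions made purely for illustration.

```python
import random

def place_copies(indexers, origin, rf, rng):
    """Toy model: the origin peer keeps one copy and rf - 1 other peers get replicas."""
    if rf > len(indexers):
        raise ValueError("RF cannot exceed the number of indexers")
    others = [i for i in indexers if i != origin]
    return {origin, *rng.sample(others, rf - 1)}

def fix_up(holders, indexers, failed, rf, rng):
    """Toy model: after a peer fails, copy the bucket to healthy peers until RF is met."""
    survivors = holders - {failed}
    candidates = [i for i in indexers if i != failed and i not in survivors]
    missing = max(0, rf - len(survivors))
    return survivors | set(rng.sample(candidates, missing))

rng = random.Random(0)  # fixed seed so the run is repeatable
indexers = ["Indexer1", "Indexer2", "Indexer3", "Indexer4", "Indexer5"]

holders = place_copies(indexers, "Indexer1", rf=2, rng=rng)
failed = next(h for h in holders if h != "Indexer1")  # the peer holding the second copy
repaired = fix_up(holders, indexers, failed, rf=2, rng=rng)

print("before failure:", sorted(holders))
print("failed peer:   ", failed)
print("after fixup:   ", sorted(repaired))
```

With RF=2 the bucket always ends up on exactly two healthy peers again, which is the "RF is met" state described above.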

If you set RF=3 initially and later change it to RF=5, the cluster will replicate every bucket that currently has 3 copies until it has 5. In other words, changing from RF=3 to RF=5 triggers a lot of additional bucket fixup activity, and how long that fixup takes depends entirely on your environment.
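As a rough back-of-the-envelope sketch of why the fixup can be heavy: every existing bucket needs (new RF - old RF) additional copies. The bucket count below is a made-up figure, and `extra_copies_needed` is a hypothetical helper, not a Splunk API.

```python
def extra_copies_needed(bucket_count, old_rf, new_rf):
    """Number of additional bucket copies the cluster must create after raising RF."""
    # Lowering RF needs no new copies, so never return a negative number.
    return bucket_count * max(0, new_rf - old_rf)

# Hypothetical cluster with 10,000 buckets going from RF=3 to RF=5.
print(extra_copies_needed(10_000, old_rf=3, new_rf=5))  # 20000
```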

The search factor is the number of searchable copies of the data that the cluster maintains. Have a look at the documentation, which explains the search factor very well: https://docs.splunk.com/Documentation/Splunk/7.3.1/Indexer/Thesearchfactor
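To see how RF and SF together drive storage, here is a rough estimate in Python. It assumes the commonly quoted rule of thumb that compressed rawdata takes about 15% of the incoming volume and index files about 35%; real ratios vary a lot with the type of data. The 90-day retention, SF=2, and the function name are assumptions for illustration only.

```python
RAWDATA_RATIO = 0.15  # assumed: compressed rawdata as a fraction of ingested volume
INDEX_RATIO = 0.35    # assumed: index (tsidx) files as a fraction of ingested volume

def estimate_cluster_storage_gb(daily_gb, retention_days, rf, sf):
    """Rough total storage across all peers for the given RF and SF."""
    if sf > rf:
        raise ValueError("search factor must be less than or equal to replication factor")
    # Every copy keeps rawdata; only searchable copies also keep index files.
    per_day = daily_gb * (RAWDATA_RATIO * rf + INDEX_RATIO * sf)
    return per_day * retention_days

for rf in (3, 5):
    total = estimate_cluster_storage_gb(daily_gb=50, retention_days=90, rf=rf, sf=2)
    print(f"RF={rf}, SF=2: about {total:,.0f} GB across the cluster")
```

Raising RF adds only rawdata copies, while raising SF adds the larger index files as well, so the two settings are worth tuning separately.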

If your data ingestion is going to grow from 50 GB/day to 1 TB/day, I suggest engaging Professional Services to design the architecture in a better and more efficient way.
