Search factor vs Replication factor

Builder

I know that setting RF=2 ensures 2 copies of each bucket on the available indexers, so this consumes 2x the disk space.
I also know that only the primary copy is searchable, i.e. SF=1. Is this the default setting for SF?

Question: if I change to SF=2, does this mean 2 copies become primary, so that 2 copies are searchable?
Does SF increase the space requirement when changed from 1 to 2?

Is this increase in space the same as for RF, i.e. double the space, or some percentage of it?

1 Solution

Motivator

Hi @vishaltaneja07011993 ,

Let's understand search factor and replication factor.
Replication Factor - the number of copies of each bucket that the cluster maintains.
Search Factor - the number of those copies that are searchable.

The claim that RF=2 with SF=1 consumes 2x the disk space is wrong: it takes somewhat less. With RF=2 and SF=2 it takes exactly 2x the disk space of a single searchable copy. Searchable copies of a bucket contain TSIDX files and a bloom filter in addition to the raw data, while non-searchable copies contain only the raw data. I hope that explains the space requirement.
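
To make the arithmetic concrete, here is a minimal sketch in Python. The function name and the sizes (15 GB of raw data, 35 GB of TSIDX and bloom filter files) are assumptions chosen purely for illustration, not measurements from a real cluster. Every copy stores the raw data, but only the searchable copies also store the index files.

```python
def cluster_disk_usage(raw_gb, index_gb, rf, sf):
    """Estimate total disk used across the cluster for one bucket.

    raw_gb   -- size of the raw data in one copy (assumed value)
    index_gb -- size of the TSIDX + bloom filter files in one searchable copy (assumed value)
    rf, sf   -- replication factor and search factor
    """
    if not 1 <= sf <= rf:
        raise ValueError("search factor must be between 1 and the replication factor")
    # All rf copies hold raw data; only sf of them also hold the index files.
    return rf * raw_gb + sf * index_gb


# Hypothetical bucket: 15 GB raw data + 35 GB index files = 50 GB for one searchable copy.
one_copy = cluster_disk_usage(15, 35, rf=1, sf=1)  # 50 GB
rf2_sf1 = cluster_disk_usage(15, 35, rf=2, sf=1)   # 65 GB, less than 2x
rf2_sf2 = cluster_disk_usage(15, 35, rf=2, sf=2)   # 100 GB, exactly 2x
```

With these assumed sizes, going from SF=1 to SF=2 adds only the index files of the second copy, so the increase depends on how large the index files are relative to the raw data.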

Coming to primary buckets: there is always exactly one primary copy of each bucket. The primary flag tells Splunk which copy of a bucket to search. If a search peer goes down, Splunk looks for another searchable copy and makes it primary; if none is found, it makes a non-searchable copy searchable and then makes that copy primary.
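
The failover order described above can be sketched as a toy model in Python. This is only an illustration of the decision order, not how Splunk implements it; the `BucketCopy` class and the `reassign_primary` function are hypothetical names, and setting `searchable = True` stands in for rebuilding the index files.

```python
from dataclasses import dataclass


@dataclass
class BucketCopy:
    peer: str               # name of the indexer holding this copy (hypothetical)
    searchable: bool        # True if the copy has index files as well as raw data
    primary: bool = False   # True if searches should use this copy
    peer_up: bool = True    # False once the indexer has gone down


def reassign_primary(copies):
    """Return the primary copy after a failure, promoting another copy if needed."""
    # A copy on a peer that is down can no longer act as primary.
    for c in copies:
        if not c.peer_up:
            c.primary = False
    live = [c for c in copies if c.peer_up]
    current = next((c for c in live if c.primary), None)
    if current is not None:
        return current
    if not live:
        return None
    # Prefer a copy that is already searchable.
    candidate = next((c for c in live if c.searchable), None)
    if candidate is None:
        # Otherwise make a non-searchable copy searchable first.
        candidate = live[0]
        candidate.searchable = True
    candidate.primary = True
    return candidate


copies = [BucketCopy("idx1", searchable=True, primary=True),
          BucketCopy("idx2", searchable=False)]
copies[0].peer_up = False           # the peer holding the primary copy goes down
new_primary = reassign_primary(copies)
```

In this RF=2, SF=1 example the only surviving copy is non-searchable, so it is made searchable and then becomes primary.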

I hope this clarifies the difference between RF and SF, and the importance of primary buckets.

Builder

Hi Vatsal, thanks for your reply.

I'm trying to understand this with the example below.

If the bucket size is 100GB, then RF=2 will result in 200GB, right?
If SF=1, then this includes index + rawdata + bloom filter, which comes to more than 200GB, i.e. more than 2x, right?
Then if SF=2, it will be 2x of bucket + searchable files.

Am I understanding this right?

I understand about primary buckets. So I can have a primary for each site, in the case of multisite...

Motivator

Correct @jiagya and @vishaltaneja07011993,

A bucket copy can be searchable or non-searchable.
Non-searchable = raw data
Searchable = raw data + tsidx + bloom filter

RF = SF (searchable copies) + non-searchable copies
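
As a quick check of that relationship, a small Python sketch (the function name `copy_breakdown` is made up for illustration) splits the RF copies into the two kinds:

```python
def copy_breakdown(rf, sf):
    """Split the rf copies of a bucket into searchable and non-searchable counts."""
    if not 1 <= sf <= rf:
        raise ValueError("search factor must be between 1 and the replication factor")
    # sf copies are searchable; the remaining rf - sf copies hold raw data only.
    return {"searchable": sf, "non_searchable": rf - sf}


breakdown = copy_breakdown(rf=3, sf=2)
```

For RF=3 and SF=2 this gives two searchable copies and one copy with raw data only.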

Motivator

Correct, SF=2 means 2x the raw data plus index files.

Builder

Thanks, that's what I wanted to know.

Motivator

@jiaqya

Yes, correct, it will increase the disk space as well. A copy kept only for RF stores just the raw data, whereas each searchable copy counted by SF also stores the index files, so it requires more space.

Please see the doc below for a better understanding:
https://docs.splunk.com/Documentation/Splunk/7.2.6/Indexer/Bucketsandclusters
