• 3 peer nodes, with replication factor = 3; search factor = 2: This requires a total of 115GB across all peer nodes (averaging 38GB/peer), calculated as follows:
o Total rawdata = (15GB * 3) = 45GB.
o Total index files = (35GB * 2) = 70GB.
• 5 peer nodes, with replication factor = 5; search factor = 3: This requires a total of 180GB across all peer nodes (averaging 36GB/peer), calculated as follows:
o Total rawdata = (15GB * 5) = 75GB.
o Total index files = (35GB * 3) = 105GB.
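The arithmetic in the two examples above can be sketched as a small Python helper. The function name and its parameters are made up for illustration and are not part of any Splunk tooling; the 15GB rawdata and 35GB index file figures are simply the ones used in the bullets.

```python
def cluster_storage_gb(rawdata_gb, index_gb, replication_factor, search_factor, peers):
    """Return (total_gb, average_gb_per_peer) for an indexer cluster.

    Hypothetical helper: rawdata is kept once per replicated copy,
    index files only once per searchable copy.
    """
    total_raw = rawdata_gb * replication_factor
    total_index = index_gb * search_factor
    total = total_raw + total_index
    return total, total / peers


# The two scenarios from the bullets above.
for rf, sf, peers in [(3, 2, 3), (5, 3, 5)]:
    total, per_peer = cluster_storage_gb(15, 35, rf, sf, peers)
    print(f"RF={rf} SF={sf}: {total}GB total, about {per_peer:.0f}GB/peer")
```

Plugging in your own replication factor and peer count shows quickly how much each extra copy costs in disk.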
How robust are your individual systems? A replication factor of 4 is awfully high. That is 4 ENTIRE copies of the data spread across the nodes. I think you may have misinterpreted what replication factor means.
Are you expecting multiple entire systems to become unrecoverable at the same time? Each increment of the replication factor means one more entire copy of all the indexers' data spread across the platform. This also means that for each GB of original data indexed on a single peer, at least another 500MB will be written on each other peer that receives a copy of that data (assuming 2:1 compression at minimum). If the replication factor equals the number of peers, that is EVERY other node in the system. You can potentially run out of available IOPS (for both searching and indexing), depending on your normal ingestion rate.
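To put a rough number on that write overhead, here is a minimal sketch. It assumes each replicated copy lands on one of replication_factor - 1 other peers and uses the 2:1 compression figure from above as a worst case; the function name and parameters are hypothetical, and real compression will vary with the data.

```python
def replicated_write_gb(ingest_gb, replication_factor, compression_ratio=2.0):
    """Estimate the extra GB written on OTHER peers for data indexed on one peer.

    Assumptions (not measured values): one compressed copy per additional
    replica, and a fixed compression ratio of 2:1 by default.
    """
    extra_copies = replication_factor - 1
    per_copy_gb = ingest_gb / compression_ratio
    return extra_copies * per_copy_gb


# 1GB indexed on one peer with a replication factor of 4:
print(replicated_write_gb(1, 4))  # 3 extra copies at 0.5GB each
```

Multiply that by your daily ingestion rate to see how much additional disk write activity the cluster has to absorb.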
Normally you have RAID, which protects the individual file systems. But say an entire RAID array fails. Then you have the replicated copies to recover from.
With a replication factor of 4, you are planning for 3 entire systems to end up with unrecoverable data at the same time.
I don't know your situation, but it seems unlikely that would be required. You may want to revisit that setting.