Deployment Architecture

Have we correctly estimated storage for our move to an indexer clustering environment?

sbattista09
Contributor

We plan on moving to a clustered environment soon, so we are starting to dive into what we need storage-wise. Based on the Splunk documentation (http://docs.splunk.com/Documentation/Splunk/6.2.0/Indexer/Systemrequirements), we will need about one terabyte added to accommodate all our hot buckets. I would like to make sure these numbers are correct.

Examples:
• 3 peer nodes, with replication factor = 3 and search factor = 2: this requires a total of 115GB across all peer nodes (averaging 38GB/peer), calculated as follows:
  o Total rawdata = (15GB * 3) = 45GB
  o Total index files = (35GB * 2) = 70GB
• 5 peer nodes, with replication factor = 5 and search factor = 3: this requires a total of 180GB across all peer nodes (averaging 36GB/peer), calculated as follows:
  o Total rawdata = (15GB * 5) = 75GB
  o Total index files = (35GB * 3) = 105GB
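
For reference, the doc's examples seem to follow a simple rule (a rough sketch, not an official sizing tool; it assumes each example starts from 100GB of original data, with rawdata at ~15% and index files at ~35% of that volume):

def cluster_storage_gb(raw_gb, peers, replication_factor, search_factor):
    # Sketch of the rule implied by the doc examples above; not an official Splunk tool.
    rawdata_per_copy = 0.15 * raw_gb   # compressed rawdata, ~15% of original volume
    index_per_copy = 0.35 * raw_gb     # index (tsidx) files, ~35% of original volume
    total_rawdata = rawdata_per_copy * replication_factor  # every replicated copy keeps rawdata
    total_index = index_per_copy * search_factor           # only searchable copies keep index files
    total = total_rawdata + total_index
    return total, total / peers        # cluster-wide total, average per peer

# Reproduces the doc examples, assuming 100GB of original data in each case:
print(cluster_storage_gb(100, peers=3, replication_factor=3, search_factor=2))  # (115.0, ~38.3)
print(cluster_storage_gb(100, peers=5, replication_factor=5, search_factor=3))  # (180.0, 36.0)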

Our planned environment:
Peers: 4
Replication factor: 4
Search factor: 2

We have about 960GB of hot buckets:
~960GB / 2 = 480GB (15% rawdata + 35% index files = ~50% of original volume)

144GB raw data
336GB associated index files

Rawdata (144GB * 4) = 576GB
Index files (336GB * 2) = 672GB
I want to make sure this is right: does it sound correct that we will need an additional 1248GB per indexer?
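
A quick check of the arithmetic (a rough sketch only; it assumes the 960GB is original data volume and uses the same ~15% rawdata / ~35% index-file split as the doc example, which states its totals across all peer nodes):

raw_gb = 960
peers, replication_factor, search_factor = 4, 4, 2

rawdata_total = 0.15 * raw_gb * replication_factor  # 144GB * 4 = 576GB
index_total = 0.35 * raw_gb * search_factor         # 336GB * 2 = 672GB
total = rawdata_total + index_total                 # 1248GB

# The doc example phrases this kind of figure as a total "across all peer nodes".
print(total, total / peers)                         # 1248.0 cluster-wide, 312.0 average per peer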


Lucas_K
Motivator

How robust are your individual systems? A replication factor of 4 is awfully high. That is 4 ENTIRE copies of the data spread across the cluster. I think you may have misinterpreted what replication factor means.

Are you expecting multiple entire systems to become unrecoverable at the same time? Each increment of the replication factor means another complete copy of all the indexers' data spread across the platform. It also means that for each GB of original data indexed on a single peer, at least another 500MB will be written on EVERY other node for the same data (assuming at least 2:1 compression). Depending on your normal ingestion rate, you can potentially run out of available IOPS for both searching and indexing.
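
To put rough numbers on that (a sketch only, assuming ~2:1 rawdata compression and replication factor 4 on a 4-peer cluster):

gb_indexed = 1.0                          # 1GB of original data arriving on one peer
compressed_rawdata_gb = gb_indexed / 2.0  # ~0.5GB of compressed rawdata per copy
replication_factor = 4

# Each of the other RF-1 peers also writes ~0.5GB of rawdata for that same 1GB,
# before counting the index files kept on the search-factor copies.
extra_replicated_writes_gb = compressed_rawdata_gb * (replication_factor - 1)
print(extra_replicated_writes_gb)         # 1.5GB of additional writes cluster-wide per 1GB indexed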

Normally you have RAID, which protects individual file systems. But say an entire RAID set fails; then you have replicated copies to recover from.
With a replication factor of 4 you are planning for three entire systems to fail with unrecoverable data.

I don't know your situation, but it seems unlikely that would be required. You may want to revisit that setting.


aljohnson_splun
Splunk Employee

Disclaimer: this tool is NOT supported by Splunk. However, it may be useful to you:

http://splunk-sizing.appspot.com/

sbattista09
Contributor

Thanks for this link, it helped out with our planning.
