
How to calculate my total indexer storage capacity?


Hello community.
Thank you for looking at my question,
I am a Splunk newbie and probably have the dumbest question ever asked.

I have indexers, heavy forwarders, and search heads.
My manager asked me to work out whether we are capable of ingesting 11.5TB of data with a retention of 3 months; 11.5TB is the total we will need to retain.

I have several indexers in a distributed deployment, and I was given a hint that I have to sum the free space across all indexers to get the total free space.

What I would like to know is:
Is there an easy way, on the deployer, in a health check, anywhere else, or with a search string, to see the real capacity I have and the amount I have left?
I have 11 indexers; some of them show 20% used, others show 110% used, etc.

What is the appropriate way to calculate the real free space?

Thank you!



No, it's a bit more complicated than that.

But I would start with the easy ones first:

1.) How big is your licence, and in particular, what is your average daily usage? If your licence is only 200GB, you need to plan carefully how you ingest that data without violating your licence. Or maybe you take the hit, index it all in one day, and accept warnings for 30 days?

2.) Are your indexers clustered? If they are, you will need to find out the replication factor (RF) for your target index. If your RF is 2, you need at least 2x the indexed space for the data, as the cluster will hold two copies, not to mention additional space for the searchable copies required by the search factor (SF).

3.) If your indexers are not clustered, then it's a bit easier, as it comes 'closer' to being the sum of available disk space. However:

4.) You will also need to allow for more than just the data volume, as tsidx files consume space, and you need a little wiggle room for processing etc.

5.) What is the data, and how will you be getting it in? Is it already extracted, and is it data you have processed before? You don't want to import 11TB and find out your line breaking was wrong!

6.) It might not actually be 11.5TB. Just because that's what it looks like on disk doesn't mean Splunk will use the same amount: it could be under (or, in rare cases, over) that size. Normally under!

7.) If you have a few indexers with an abundance of space, why not create a new index just on those indexers? That way you don't have to worry about running out of disk on your 'fuller' peers.
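
To make points 1, 2 and 4 a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustration, not an official sizing tool. The ratios are assumptions based on the commonly quoted rule of thumb that compressed rawdata takes roughly 15% of the raw size and tsidx files roughly 35%; your real ratios depend on the data (see point 6). The function names, the 10% headroom, the RF/SF of 2 and the 200GB licence are all hypothetical example values, so plug in your own.

```python
import math


def estimate_disk_tb(raw_tb, replication_factor=1, search_factor=1,
                     rawdata_ratio=0.15, tsidx_ratio=0.35, headroom=0.10):
    """Rough disk estimate (TB) for raw_tb of incoming data.

    ASSUMPTIONS (rule of thumb, not measured values):
      - compressed rawdata is about rawdata_ratio of the raw size
      - tsidx (index) files are about tsidx_ratio of the raw size
      - the cluster keeps replication_factor copies of rawdata and
        search_factor copies of the tsidx files
      - headroom is extra wiggle room for processing
    For a non-clustered setup, leave both factors at 1.
    """
    rawdata_tb = raw_tb * rawdata_ratio * replication_factor
    tsidx_tb = raw_tb * tsidx_ratio * search_factor
    return (rawdata_tb + tsidx_tb) * (1 + headroom)


def days_to_ingest(raw_tb, daily_licence_gb):
    """Whole days needed to index raw_tb without exceeding the daily licence.

    Assumes 1 TB = 1024 GB and that the full daily quota is available.
    """
    return math.ceil(raw_tb * 1024 / daily_licence_gb)


# Example values only: 11.5TB of raw data, RF=2, SF=2, 200GB/day licence.
needed_tb = estimate_disk_tb(11.5, replication_factor=2, search_factor=2)
print(f"Estimated disk needed across the cluster: {needed_tb:.2f} TB")
print(f"Days to ingest within the licence: {days_to_ingest(11.5, 200)}")
```

Compare the first number against the summed free space of the indexers that will actually hold the index, and keep in mind that a peer that is already nearly full will hit its limit long before the total does.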

If my comment helps, please give it a thumbs up!