Why is the index size much larger than the actual data by 3x to 4x?

red2play
Loves-to-Learn

The index size is 5.3 GB versus 1.6 GB of raw data:

Raw data (screenshot: red2play_1-1699030026808.png)

Index on Splunk (screenshot: red2play_0-1699029952245.png)

This is also affecting our licensing plans. This is way bigger than anticipated. I expected 110%, or maybe even 180%, but not 400%. Something's off.


PickleRick
SplunkTrust

I don't know what those numbers are but remember that just because you're ingesting 1GB of data daily doesn't mean you're gonna consume 1GB of disk space daily.

Firstly, you store compressed raw data. It's gzipped so it compresses fairly well as text data generally does. It takes up around 1/7, maybe 1/6 of the original raw data size on average.

Alongside that, you store index files, which take up roughly another 1/3 of the original raw data size.

Roughly estimating, you need about 1/2 of the original raw data size on disk to store the indexed data (compressed raw data plus index files).
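
To put numbers on that, here is a minimal Python sketch of the estimate. It only uses the rough averages mentioned above (about 1/7 for compressed raw data, about 1/3 for index files); the function name and the 1.6 GB input are just for illustration, and real ratios vary with the kind of data you ingest.

```python
def estimate_disk_usage_gb(raw_gb, rawdata_ratio=1 / 7, index_ratio=1 / 3):
    """Rough on-disk estimate for a given amount of original raw data.

    rawdata_ratio and index_ratio are assumed averages, not measured values.
    Returns (compressed_rawdata_gb, index_files_gb, total_gb).
    """
    compressed = raw_gb * rawdata_ratio   # gzipped raw data
    index_files = raw_gb * index_ratio    # index files stored alongside it
    return compressed, index_files, compressed + index_files


compressed, index_files, total = estimate_disk_usage_gb(1.6)
print(f"compressed raw data: {compressed:.2f} GB")
print(f"index files:         {index_files:.2f} GB")
print(f"total:               {total:.2f} GB")
```

With these assumed ratios, 1.6 GB of raw data comes out at well under 1 GB on disk, so summaries from accelerations are not included in this figure.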

But if you use accelerations and don't keep their summaries on separate storage, you also need to account for the space those summaries take.
