
Why is the index size much larger than the actual data by 3x to 4x?

red2play
Loves-to-Learn

Index size is 5.3 GB vs. 1.6 GB of raw data:

Raw data 

red2play_1-1699030026808.png

Index on Splunk

red2play_0-1699029952245.png

 

This is also affecting our licensing plans. It is way bigger than anticipated. I expected 110%, or maybe even 180%, but not 400%. Something's off.

 


PickleRick
SplunkTrust

I don't know what those numbers are but remember that just because you're ingesting 1GB of data daily doesn't mean you're gonna consume 1GB of disk space daily.

Firstly, you store compressed raw data. It's gzipped so it compresses fairly well as text data generally does. It takes up around 1/7, maybe 1/6 of the original raw data size on average.

Alongside that, you store index files, which take up roughly another 1/3 of the original raw data size.

As a rough estimate, you need disk space equal to about 1/2 of the original raw data size just to store the indexed data (compressed raw data plus index files).
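To make the arithmetic concrete, here is a minimal sketch of that rule of thumb. The ratios are the averages quoted in this answer, not guarantees for any particular data set. The function name `estimate_disk_gb` is purely illustrative, and 1.6 GB is the raw size from the question. Acceleration summaries and any replication are not included.

```python
# Rule-of-thumb ratios from the answer above (averages; real data will vary).
RAWDATA_RATIO_LOW = 1 / 7    # compressed raw data, optimistic case
RAWDATA_RATIO_HIGH = 1 / 6   # compressed raw data, pessimistic case
INDEX_FILES_RATIO = 1 / 3    # index files


def estimate_disk_gb(raw_gb, rawdata_ratio, index_ratio=INDEX_FILES_RATIO):
    """Estimate disk usage for raw_gb of ingested data.

    Returns (compressed_raw_gb, index_files_gb, total_gb).
    Hypothetical helper for illustration only.
    """
    compressed_raw = raw_gb * rawdata_ratio
    index_files = raw_gb * index_ratio
    return compressed_raw, index_files, compressed_raw + index_files


if __name__ == "__main__":
    raw_gb = 1.6  # raw data size reported in the question
    for label, ratio in (("1/7", RAWDATA_RATIO_LOW), ("1/6", RAWDATA_RATIO_HIGH)):
        compressed, index_files, total = estimate_disk_gb(raw_gb, ratio)
        print(
            f"rawdata at {label}: compressed={compressed:.2f} GB, "
            f"index files={index_files:.2f} GB, total={total:.2f} GB "
            f"({total / raw_gb:.0%} of raw)"
        )
```

With these ratios, 1.6 GB of raw data would be expected to take somewhat under 1 GB on disk, so a 5.3 GB figure suggests the two numbers being compared are probably not measuring the same thing.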

If you use accelerations, you also need to account for the additional summaries, unless you keep them on separate storage.
