Splunk Search

Compression rate of indexed data: 50 GB/day over 3 weeks uses 100 GB of HDD space

jan_wohlers
Path Finder

Hey,

we just set up an indexer 3 weeks ago. By now we are indexing about 50 GB per 24 hours. If I go to Manager -> Indexes, I can see that our main index only has a size of about 100 GB. Mostly just event logs are being indexed. Is the compression really so good that about 20 days of 50 GB/day fit into a 100 GB index?
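For reference, those figures imply an on-disk footprint of roughly 10% of raw data. A quick sanity check of that arithmetic (a sketch using makeresults, with the numbers from this post hard-coded):

| makeresults
| eval rawTotalGB = 50 * 20, diskGB = 100
| eval impliedRatio = tostring(round(diskGB / rawTotalGB * 100, 2)) + "%"
| table rawTotalGB, diskGB, impliedRatio

This returns 1000 GB raw versus 100 GB on disk, i.e. about 10%.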

Thanks for your answer in advance!

Jan

1 Solution

MuS
Legend

Hi jan.wohlers

basically, a compression rate between 40-50% is normal. You can check it with this search:

| dbinspect index=_internal
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression

cheers,

MuS


jan_wohlers
Path Finder

Okay, thanks for the answer. The compression rate is 21%, which seems pretty good.


ibondarets
Explorer

Thank you for this handy search example!
I have a couple of questions regarding it:
1) How can I build a search that gives me a table of all present indexes with their compression ratio? I tried this:

| dbinspect index=*
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression

but it didn't work.
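For what it's worth, a per-index variant mainly needs the index field kept through the pipeline and shown in the final table; dbinspect emits an index field per bucket when run against index=*. A sketch along those lines (untested):

| dbinspect index=*
| fields index, rawSize, sizeOnDiskMB
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB = round(rawTotal / 1024 / 1024, 2)
| eval compression = tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table index, rawTotalinMB, diskTotalinMB, compression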
2) What does it mean when I get this:
[screenshot attachment not shown]


ConnorG
Path Finder

I see a similar compression percentage to ibondarets'.

I guess that means our data is actually larger once indexed.
