compression rate of indexed data: 50gig/day in 3 weeks uses 100gig HDD space

jan_wohlers
Path Finder

Hey,

we just set up an indexer 3 weeks ago. By now we are indexing about 50 GB per day. If I go to Manager -> Indexes I can see that our main index only has a size of about 100 GB. Mostly just event logs are being indexed. Is the compression really so good that about 20 days of 50 GB/day can be stored in a 100 GB index?

Thanks for your answer in advance!

Jan

1 Solution

MuS
SplunkTrust
SplunkTrust

Hi jan.wohlers

basically you can say compression of 40-50% is normal; you can check this with this search:

| dbinspect index=_internal
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression
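
As a rough back-of-the-envelope check against the numbers in the question (assuming roughly 21 days of data are still retained; makeresults just creates a dummy event to run the eval on):

| makeresults
| eval rawGB=50*21, diskGB=100
| eval compression=tostring(round(diskGB / rawGB * 100, 2)) + "%"
| table rawGB, diskGB, compression

That comes out to roughly 9.5%, even lower than the usual 40-50%, which is plausible for highly repetitive data such as Windows event logs.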

cheers,

MuS


jan_wohlers
Path Finder

Okay, thanks for the answer. The compression rate is 21%, which seems pretty good.


ibondarets
Explorer

Thank you for this handy search example!
I have a couple of questions regarding it:
1) How can I build a search that gives me a table of all present indexes with their compression ratios? I tried this:

| dbinspect index=*
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression

but it didn't work (a corrected sketch follows at the end of this post).
2) What does it mean when I get this:
[screenshot of the search result, showing a compression value above 100%]
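
On question 1), a sketch of a likely fix, assuming dbinspect emits an index field for each bucket: the fields command in the attempt above drops index, so stats ... by index has nothing to group on. Keeping index through the whole pipeline should work:

| dbinspect index=*
| fields index, state, id, rawSize, sizeOnDiskMB
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table index, rawTotalinMB, diskTotalinMB, compression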


ConnorG
Path Finder

I have a compression percentage similar to ibondarets's.

Guess that means our data actually takes up more space once indexed. (That can happen because sizeOnDiskMB covers the tsidx index files as well as the compressed raw data, so values above 100% are possible.)
