
compression rate of indexed data: 50gig/day in 3 weeks uses 100gig HDD space

Path Finder

Hey,

we just set up an indexer three weeks ago. By now we are indexing about 50 GB per 24 hours. If I go to Manager -> Indexes, I can see that our main index only has a size of about 100 GB. Mostly just event logs are being indexed. Is the compression really so good that about 20 days of 50 GB/day (roughly 1 TB of raw data) fit into a 100 GB index?

Thanks in advance for your answers!

Jan

Re: compression rate of indexed data: 50gig/day in 3 weeks uses 100gig HDD space

SplunkTrust

Hi jan.wohlers

basically, a compression ratio between 40% and 50% (size on disk compared to raw data size) is normal. You can check this with the following search:

| dbinspect index=_internal
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression

cheers,

MuS

Re: compression rate of indexed data: 50gig/day in 3 weeks uses 100gig HDD space

Path Finder

Okay, thanks for the answer. The compression rate is 21%, which seems pretty good.

Re: compression rate of indexed data: 50gig/day in 3 weeks uses 100gig HDD space

Explorer

Thank you for this handy search example!
I have a couple of questions regarding it:
1) How can I build a search which will give me a table of all present indexes with their compression ratio? I tried this:

| dbinspect index=*
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression

but it didn't work; a possible fix is sketched at the end of this post.
2) what does it mean when I get this:
[screenshot of the search results]
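
PS regarding question 1: I suspect the reason it didn't work is that the fields command drops the index field before the stats ... by index step, so there is nothing left to group on. Keeping index in the field list might do the trick; here is an untested sketch (assuming dbinspect in your Splunk version returns an index field per bucket):

| dbinspect index=*
| fields index, state, rawSize, sizeOnDiskMB
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=round(rawTotal / 1024 / 1024, 2) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table index, rawTotalinMB, diskTotalinMB, compression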

Re: compression rate of indexed data: 50gig/day in 3 weeks uses 100gig HDD space

Path Finder

I have a similar compression percentage to ibondarets'.

I guess that means our data is actually larger once indexed.
