Deployment Architecture

What is the compression ratio of raw data in Splunk?

lal37
Explorer

Hi Team,

Can anyone please tell me the compression ratio of raw data in Splunk?
I heard that it's a 10:1 ratio, meaning the raw logs compress to about 10% of their original size, plus additional space for the index files.
Can anyone explain what the compression ratio is in Splunk when it stores data on the indexer?
Ex:
If I have 100 GB of logs, how much indexer space will it take? Is it 10 GB of indexer space?

Regards,
lal


mhassan
Path Finder

The docs say that 100 GB of incoming data breaks down to roughly 15% for the raw data (the journal.gz file) and 35% for the metadata (the tsidx files), so your 100 GB will occupy about 50 GB of disk. Be aware that this is an average: different ASCII files have different compression ratios (based on how many repeated patterns they contain).
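The arithmetic above can be sketched as a small helper. This is only an estimator built on the rule-of-thumb factors quoted in this thread (~15% for the compressed raw journal, ~35% for the tsidx files); the real ratios depend on how compressible your data is.

```python
# Rough Splunk indexer disk-sizing estimate.
# The factors below are the rule-of-thumb values discussed in this
# thread, not guarantees: actual ratios vary with the data.
RAW_FACTOR = 0.15    # compressed raw data (journal.gz)
INDEX_FACTOR = 0.35  # index/metadata files (tsidx)

def estimate_disk_gb(ingest_gb: float) -> float:
    """Estimate indexer disk usage (GB) for a given raw ingest volume (GB)."""
    return ingest_gb * (RAW_FACTOR + INDEX_FACTOR)

print(estimate_disk_gb(100))  # 100 GB of logs -> roughly 50 GB on disk
```

Adjust the two factors to match measurements from your own environment; structured, repetitive logs usually compress better than the averages used here.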


santiagoaloi
Path Finder

This site will save you many headaches:

https://splunk-sizing.appspot.com


gfuente
Motivator

Hello

It's usually about half of the original size, so to answer your question: 100 GB of logs would need about 50 GB, of which roughly 10 GB would be the original logs compressed and 40 GB would be the indexes.

Regards

gfuente
Motivator

Yes, that's right.

Those figures are approximate; the exact ratio depends on the data itself, but as a rule of thumb you can estimate 10% for the compressed raw data plus 40% for the indexes.

Regards

lal37
Explorer

Hi gfuente,
Thanks for the prompt response.
According to your response, I have one small query: the original log file takes only 10 GB of the 100 GB original size, so that means a 10:1 compression ratio, right?
Also, can you please confirm whether the indexes take 40 GB of space for 100 GB of logs.
