Hi Team,
Can anyone tell me what the compression ratio of raw data is in Splunk?
I heard that it's a 10:1 ratio, which would mean the stored raw logs take about 10% of the original size, and I'm not sure how much the index files add on top of that.
Can anyone explain what the compression ratio is in Splunk when it stores data on the indexer?
Ex:
If I have 100 GB of logs, how much indexer space will they take? Is it 10 GB of indexer space?
Regards,
lal
See my reply here, in case it helps.
The docs say 100 GB of incoming data breaks down to roughly 15% for raw data (the journal.gz file) and 35% for metadata (the tsidx files). So your 100 GB will occupy ~50 GB of space. Be aware that this is an average: different ASCII files compress at different ratios (based on repeated patterns).
This page will save you many headaches:
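To make the arithmetic concrete, here is a minimal Python sketch based on those rule-of-thumb percentages (15% raw, 35% index). The ratios are assumptions taken from the reply above, not fixed values; actual compression depends on your data.

# Rough Splunk indexer storage estimate using the rule-of-thumb ratios above.
# These percentages are averages; actual compression varies with the data.
RAW_RATIO = 0.15    # compressed raw data (journal.gz) as a share of incoming size
INDEX_RATIO = 0.35  # index metadata (tsidx files) as a share of incoming size

def estimate_storage(incoming_gb):
    """Estimate on-disk usage (GB) for a given volume of incoming log data."""
    raw_gb = incoming_gb * RAW_RATIO
    index_gb = incoming_gb * INDEX_RATIO
    return raw_gb, index_gb, raw_gb + index_gb

# Example: 100 GB of logs -> (15.0, 35.0, 50.0)
print(estimate_storage(100))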
Hello
It's usually about half the original size, so for your example 100 GB would need about 50 GB: roughly 10 GB for the compressed original logs and 40 GB for the indexes.
Regards
Yes, that's right.
Those figures are approximate and depend on the data itself, but as a rule of thumb you can estimate 10% for compressed raw data plus 40% for indexes.
Regards
Hi gfuente,
Thanks for the prompt response.
Based on your response, I have one small query: the stored original log file takes only 10 GB for 100 GB of original data, so that means a 10:1 compression ratio, right?
Also, can you please confirm whether the indexes take 40 GB of space for 100 GB of logs?
See these:
http://answers.splunk.com/answers/57248/compression-rate-of-indexed-data-50gigday-in-3-weeks-uses-10...
http://answers.splunk.com/answers/106904/trying-to-understand-compression-given-compression-of-x-vol...
http://answers.splunk.com/answers/52075/compression-rate-for-indexes-hot-warm-cold-frozen
See the Storage Requirement Examples section in http://docs.splunk.com/Documentation/Splunk/6.1.2/Indexer/Systemrequirements for more details.
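For longer-term capacity planning, the same rule of thumb extends to retention: multiply daily ingest by retention days and the ~50% total ratio. A hedged sketch; the 0.5 factor is the approximation discussed in this thread, so adjust it for your own data.

# Retention-based sizing, assuming the ~50% total-disk rule of thumb
# (compressed raw plus index files) discussed above.
TOTAL_RATIO = 0.5  # assumed fraction of incoming data that ends up on disk

def retention_storage_gb(daily_ingest_gb, retention_days):
    """Estimate total indexer disk (GB) needed to keep retention_days of data."""
    return daily_ingest_gb * retention_days * TOTAL_RATIO

# Example: 100 GB/day kept for 30 days -> ~1500 GB of indexer storage.
print(retention_storage_gb(100, 30))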