What is the compression ratio of raw data in Splunk?

lal37
Explorer

Hi Team,

Can anyone please tell me the compression ratio of raw data in Splunk?
I heard that it is a 10:1 ratio, which would mean the stored raw data is about 10% of the original raw logs, with the index files taking about another 10%.
Could anyone explain what compression ratio Splunk achieves when it stores data on an indexer?
Example:
If I have 100 GB of logs, how much indexer space will they take? Is it 10 GB?

Regards,
lal

mhassan
Path Finder

According to the docs, 100 GB of incoming data typically ends up as about 15% compressed raw data (the journal.gz file) and about 35% metadata (the tsidx files), so your 100 GB will occupy roughly 50 GB on disk. Be aware that this is an average: different text files compress at different ratios, depending on how many repeated patterns they contain.
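
To make the arithmetic explicit, here is a rough Python sketch of that estimate. The function name and parameters are made up for illustration, and the 15% / 35% defaults are just the averages quoted above, so treat the result as a starting point rather than a guarantee for your own data.

```python
def estimate_disk_usage_gb(ingested_gb, rawdata_pct=15, tsidx_pct=35):
    """Rough estimate of indexer disk usage for a volume of ingested data.

    rawdata_pct and tsidx_pct are assumed averages (percent of the
    original size); real values vary with the data being indexed.
    """
    rawdata_gb = ingested_gb * rawdata_pct / 100  # compressed raw data (journal.gz)
    tsidx_gb = ingested_gb * tsidx_pct / 100      # index metadata (tsidx files)
    return {
        "rawdata_gb": rawdata_gb,
        "tsidx_gb": tsidx_gb,
        "total_gb": rawdata_gb + tsidx_gb,
    }


estimate = estimate_disk_usage_gb(100)
print(estimate)
```

If your data compresses better or worse than average, pass your own percentages, for example `estimate_disk_usage_gb(100, rawdata_pct=10, tsidx_pct=40)`.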

santiagoaloi
Path Finder

This site will save you many headaches:

https://splunk-sizing.appspot.com

gfuente
Motivator

Hello

It's usually about half of the original size, so 100 GB would need about 50 GB: roughly 10 GB of that would be the original logs compressed, and 40 GB the index files.

Regards

gfuente
Motivator

Yes, that's right.

Those figures are approximate and depend on the data itself, but as a rule of thumb you can estimate 10% for the compressed raw data plus 40% for the indexes.
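
As a quick illustration of what that rule of thumb means in terms of ratios, here is a small Python sketch. The variable and function names are only for this example, and the 10 GB / 40 GB figures are the rule-of-thumb values above, not measurements.

```python
def compression_ratio(original_gb, stored_gb):
    """Return the N in an N:1 ratio of original size to stored size."""
    return original_gb / stored_gb


original_gb = 100  # size of the incoming logs
rawdata_gb = 10    # compressed raw data (rule of thumb: 10%)
index_gb = 40      # index files (rule of thumb: 40%)

raw_ratio = compression_ratio(original_gb, rawdata_gb)
overall_ratio = compression_ratio(original_gb, rawdata_gb + index_gb)

print(f"rawdata only: {raw_ratio:.0f}:1")
print(f"rawdata + index files: {overall_ratio:.0f}:1")
```

So the 10:1 figure applies only to the compressed raw data; once the index files are counted, the overall disk footprint is closer to 2:1.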

Regards

lal37
Explorer

Hi gfuente,
Thanks for the prompt response.
I have one small follow-up question: the original log data takes only 10 GB of the original 100 GB, so that means a 10:1 compression ratio, right?
Also, can you please confirm whether the indexes take 40 GB of space for 100 GB of logs?
