
Splunk compression rate for archiving data

Path Finder

I have to set up an archiving policy and work out storage requirements in Splunk. The estimated log volume is 100 GB per day. Going by the documentation, Splunk will index this as about 50 GB (a compression rate of 50%). As the data ages, those 50 GB will move from Hot -> Warm -> Cold. At that point I will set up an archival policy to S3 (AWS). I want to know whether Splunk will archive the whole 50 GB or 100 GB to S3, and how much data will be indexed back on restore. Is it going to be 50 GB?

Please help

0 Karma

Communicator

Has anything changed in this topic?

Are these calculations still accurate (I mean about 15% for data and about 35% for metadata)?

0 Karma

Ultra Champion

On average, Splunk data takes up about half the size of the original raw data on disk. So, as a rough estimate, your original 100 GB will become about 35 GB of index files and 15 GB of compressed raw data.

When data is frozen, which is what I assume you mean by "archival policy", only the compressed raw data is saved and the index files are deleted. So only about 15% of the original raw data size is archived, i.e. roughly 15 GB.

When/if you need to restore archived (frozen) data, you will need to rebuild the index files before you can search it again. That brings it back to roughly 15 + 35 = 50 GB.
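
To make the arithmetic concrete, here is a rough sketch in Python. The 15% and 35% ratios are just the rule-of-thumb averages mentioned above, not guaranteed values, and the function name and its parameters are made up for illustration. Measure your own buckets before sizing anything for real.

```python
def estimate_storage(daily_raw_gb, rawdata_ratio=0.15, index_ratio=0.35):
    """Rough per-day storage estimate from rule-of-thumb ratios.

    rawdata_ratio and index_ratio are assumed averages (about 15% compressed
    raw data, about 35% index files); real values vary per data source.
    """
    compressed_raw = daily_raw_gb * rawdata_ratio
    index_files = daily_raw_gb * index_ratio
    return {
        # Hot/warm/cold buckets hold both the compressed raw data and index files.
        "searchable_gb": round(compressed_raw + index_files, 2),
        # Frozen (archived) buckets keep only the compressed raw data.
        "frozen_gb": round(compressed_raw, 2),
        # Restoring rebuilds the index files, so the size grows back.
        "restored_gb": round(compressed_raw + index_files, 2),
    }


if __name__ == "__main__":
    for name, size in estimate_storage(100).items():
        print(f"{name}: {size} GB")
```

With 100 GB/day this gives about 50 GB searchable, about 15 GB going to the archive (S3 in your case), and about 50 GB again after a restore.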

/K

Ultra Champion

So the "50%" would be the size of the bucket as a whole, compared to the uncompressed size of the .gz file found in its rawdata directory.

This can vary from bucket to bucket, and will depend on the compressibility of the log data coming in. Over a diverse set of log sources, the figure "50%" is commonly mentioned as an average compression rate.

0 Karma

Ultra Champion

Check /opt/splunk/var/lib/splunk/defaultdb/db/

That's where the 'main' index (defaultdb) is stored. In this folder you will find the hot and warm buckets as subdirs, e.g. db_1234123412_12341234325_33

Inside a bucket there will be some metadata files and .tsidx files (indexes for searching the raw data). Finally, there will be a directory called 'rawdata' that contains the compressed raw data.
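
If you would rather not add up the sizes by hand, something like the following Python sketch could walk the bucket directories and report how much of each bucket is compressed raw data. It only assumes the layout described above (bucket subdirectories that each contain a 'rawdata' directory). The function names are my own, and the path is the default one mentioned here, so adjust it for your installation and index.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def bucket_sizes(db_path):
    """Return {bucket_name: {"rawdata_bytes": ..., "total_bytes": ...}}.

    A subdirectory is treated as a bucket only if it contains 'rawdata'.
    """
    results = {}
    for entry in sorted(os.listdir(db_path)):
        bucket = os.path.join(db_path, entry)
        rawdata = os.path.join(bucket, "rawdata")
        if os.path.isdir(rawdata):
            results[entry] = {
                "rawdata_bytes": dir_size_bytes(rawdata),
                "total_bytes": dir_size_bytes(bucket),
            }
    return results


if __name__ == "__main__":
    # Default location of the 'main' index; change this for other indexes.
    db_path = "/opt/splunk/var/lib/splunk/defaultdb/db/"
    if os.path.isdir(db_path):
        for name, sizes in bucket_sizes(db_path).items():
            print(name, sizes["rawdata_bytes"], sizes["total_bytes"])
```

The rawdata figure is roughly what would end up in the archive when the bucket is frozen. The difference between the two numbers is mostly index files.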

0 Karma

Path Finder

How can I check the compressed data size?

0 Karma