Deployment Architecture

Storage size thoughts and calculations

horsefez
Motivator

Hi fellow Splunkers!

I'm currently trying to figure out how much storage my Splunk deployment would need if I indexed up to 10 GB of data per day.

What Splunk thinks about it:

Typically, the compressed rawdata file is 10% the size of the incoming, pre-indexed raw data. The associated index files range in size from approximately 10% to 110% of the rawdata file. The number of unique terms in the data affects this value.

http://docs.splunk.com/Documentation/Splunk/6.3.3/Capacity/Estimateyourstoragerequirements

What I think about it:

I have a log volume of 10 GB per day.
This would be an estimated rawdata size of... 10 GB.
The compressed rawdata (10%) would then be about 1 GB.
At best (10%), the index files would add another 1 GB.
At worst (110%), the index files would add another 11 GB.

As a result, I would have to plan for between 2 GB and 12 GB of storage for every day of data I want to keep.
Am I right?
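To double-check the arithmetic, here is a minimal sketch of that estimate in Python (the 10% compression and 10%-110% index-file ratios are the figures from the Splunk docs quoted above; the variable names are purely illustrative):

```python
# Rough daily storage estimate, using the ratios from the Splunk
# capacity-planning docs quoted above (illustrative numbers only).
raw_per_day_gb = 10.0        # daily indexing volume

rawdata_ratio = 0.10         # compressed rawdata is ~10% of the raw volume
index_ratio_min = 0.10       # index files: roughly 10% ...
index_ratio_max = 1.10       # ... up to 110% of the raw volume

compressed_rawdata_gb = raw_per_day_gb * rawdata_ratio   # 1 GB
index_min_gb = raw_per_day_gb * index_ratio_min          # 1 GB
index_max_gb = raw_per_day_gb * index_ratio_max          # 11 GB

low = compressed_rawdata_gb + index_min_gb               # 2 GB
high = compressed_rawdata_gb + index_max_gb              # 12 GB
print(f"Daily storage: {low:.0f} to {high:.0f} GB")
```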

What you think about it:
...

0 Karma
1 Solution


woodcock
Esteemed Legend

The general rule of thumb calculation is:

raw_daily_bandwidth * days_to_retain_data * index_replication_factor / 2

This includes the reduction due to compression and the bloat due to indexing overhead, ASSUMING you are NOT using `indexed_extractions`.
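For example, a quick sketch of that rule of thumb (the retention, replication factor, and sample inputs below are just illustrative, not recommendations):

```python
def estimate_storage_gb(raw_daily_gb, retention_days, replication_factor):
    """Rule-of-thumb total storage: raw volume * retention * replication / 2.
    The /2 folds together rawdata compression and index-file overhead,
    assuming indexed_extractions is NOT used."""
    return raw_daily_gb * retention_days * replication_factor / 2

# e.g. 10 GB/day, kept for 90 days, with an index replication factor of 2
print(estimate_storage_gb(10, 90, 2))  # 900.0 GB
```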
0 Karma

gwiley_splunk
Splunk Employee

One key question here is: how long do you want to keep the data?

Other questions worthy of consideration are:

  • are you planning to use data models and/or summary indexes? Note that some apps on Splunkbase may use data models or summary indexes.
  • are you running a cluster and planning to use index replication?

The splunk-sizing web app will help you get most of the way there and allows you to specify storage contingency.

Cheers, Greg.

0 Karma

horsefez
Motivator

Thanks to you, too! 🙂
What would change if I plan to use data models?
Would I need to reserve even more storage?

0 Karma

jmallorquin
Builder

Only if you accelerate the data models.

0 Karma

jmallorquin
Builder

Hi,

You can use this tool.

https://splunk-sizing.appspot.com/

Hope this helps you

horsefez
Motivator

Wow, this is an amazing tool. Thank you! 🙂

0 Karma