Disk space requirements

mcamilleri
Path Finder

I need to get a vague idea of disk space requirements before I start forwarding logs to a Splunk instance. Each indexed line will have on average 320 characters and I will be indexing around 500,000 lines a day.

My assumptions are 1 byte per character, and I'm ignoring the space Splunk takes for indices, etc. That's 320 × 500,000 = 160,000,000 bytes, or about 160 MB per day.
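
For anyone who wants to plug in their own numbers, here is a minimal sketch of that arithmetic. The function name `raw_bytes_per_day` is just illustrative, and the 1-byte-per-character figure is the assumption stated above (multi-byte encodings such as UTF-8 with non-ASCII text would come out larger).

```python
# Raw daily log volume from line count and average line length.
# Assumption: 1 byte per character, as stated in the post.

def raw_bytes_per_day(lines_per_day: int, avg_chars_per_line: int,
                      bytes_per_char: int = 1) -> int:
    """Return the estimated raw (uncompressed) bytes produced per day."""
    return lines_per_day * avg_chars_per_line * bytes_per_char


if __name__ == "__main__":
    raw = raw_bytes_per_day(lines_per_day=500_000, avg_chars_per_line=320)
    # Report in decimal megabytes (10**6 bytes) and binary mebibytes (2**20 bytes).
    print(f"{raw / 1_000_000:.0f} MB/day, {raw / 2**20:.1f} MiB/day")
```

Note that the 160 MB figure uses decimal megabytes; in binary units the same volume is a little under 153 MiB.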

Would you say that's semi-accurate or totally off the mark?

adauria_splunk
Splunk Employee

The general rule of thumb I've been taught is to take your raw data size and figure about 50% of that on disk, including indexes. Compression shrinks the raw data significantly, while indexing adds some of that size back on disk.
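
Applied to the numbers in the question, the rule of thumb can be sketched as below. The 0.5 ratio is only the ballpark mentioned here, not a measured value, and `estimate_disk_bytes` is a hypothetical helper name.

```python
# Rule-of-thumb on-disk estimate: raw size times an assumed ratio.
# Assumption: ~50% of raw size ends up on disk (compressed data plus indexes).

def estimate_disk_bytes(raw_bytes: float, disk_ratio: float = 0.5) -> float:
    """Return the estimated on-disk bytes for a given raw data volume."""
    if disk_ratio < 0:
        raise ValueError("disk_ratio must be non-negative")
    return raw_bytes * disk_ratio


if __name__ == "__main__":
    raw_per_day = 500_000 * 320  # 160,000,000 bytes, from the question
    on_disk = estimate_disk_bytes(raw_per_day)
    print(f"~{on_disk / 1_000_000:.0f} MB on disk per day")
```

With the question's figures, that works out to roughly 80 MB on disk per day.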

Of course, this is a rule of thumb, YMMV. The best way to find out is to index a sample (say, a day's or a week's worth of data) and see how large the files are on disk. The actual compression and index size can vary significantly.
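
If you do run that test, one rough way to get your own ratio is to add up the size of the index directory and divide by the raw size of what you fed in. The sketch below uses only the Python standard library; the directory path is something you would supply yourself (where the index files live depends on your installation, so none is assumed here), and `dir_size_bytes` and `observed_ratio` are illustrative names.

```python
import os
import sys


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path (recursively)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return total


def observed_ratio(on_disk_bytes: float, raw_bytes: float) -> float:
    """Return on-disk size as a fraction of the raw data size."""
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    return on_disk_bytes / raw_bytes


if __name__ == "__main__":
    # Usage: python measure.py <index_directory> <raw_bytes_indexed>
    index_dir, raw_bytes = sys.argv[1], float(sys.argv[2])
    on_disk = dir_size_bytes(index_dir)
    print(f"on disk: {on_disk} bytes, ratio: {observed_ratio(on_disk, raw_bytes):.2f}")
```

A ratio measured this way on your own data can then replace the 50% guess.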

mcamilleri
Path Finder

Thanks! I don't have ready access to a Splunk instance, but that ballpark estimate should do for now.
