Deployment Architecture

Estimating storage requirements when ingesting only internal logs

hectorvp
Communicator

Hi Splunkers,

We need to estimate the disk space required for our single box Splunk enterprise.

We are planning to ingest only splunkd internal logs, and I don't see any way to estimate the disk space they will need. I don't know how many events a UF generates or how large a single event would be.

We will have around 400 UFs running on servers, with an expected retention policy of 60 days.

I'm afraid that 500 GB of space will fill up before 60 days and we won't have the internal logs we need.

Apart from this, please suggest whether I really need RAID 1+0 for internal logs; there would only be a few scheduled searches for health checks and the DMC. Or would simpler storage suffice?

Is there any way to estimate this part?

1 Solution

gcusello
SplunkTrust

Hi @hectorvp,

The number of _internal events is highly variable, so the correct approach is to look at the actual events and storage occupation in your real environment.

Anyway, you can start this analysis with a baseline of 800,000 events/day, which means around 0.07 GB/day for each server.

For your 400 UFs, that is around 28.5 GB/day.
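
Once the deployment is running, you can replace that baseline with a measured value. Here is a sketch of a search over the standard per_index_thruput events in metrics.log (keep in mind that metrics.log only samples the busiest series per interval, so treat the result as an approximation):

index=_internal source=*metrics.log group=per_index_thruput series=_internal
| eval GB = kb / 1024 / 1024
| timechart span=1d sum(GB) AS GB_ingested_per_day

Run it over a full week or more, since the event rate varies with restarts, scheduled searches and errors.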

So the storage depends mostly on the retention policy you use.

In my projects I usually use 15 or 30 days of retention; I think this is a useful period to analyze what happened in case of problems, and I don't think that older events are useful.

This means around 425 or 850 GB of raw events, which compress to about half of these values on disk.

In conclusion, 400 UFs with a retention of 15 days on two clustered indexers use around 215 GB on each indexer.
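
For comparison, your 60-day target at the same baseline would be roughly 28.5 GB/day x 60 = about 1,700 GB raw, or around 850 GB on disk after compression, so 500 GB would probably not hold 60 days. You can check the actual compressed footprint at any time with dbinspect, for example:

| dbinspect index=_internal
| stats sum(sizeOnDiskMB) AS totalMB
| eval totalGB = round(totalMB / 1024, 1)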

About the question of RAID 1+0: it isn't mandatory, but I usually use RAID 1+0 for all indexes, also because _internal logs are usually used to check whether a server is up and running. To consume less high-performance storage, it can be useful to put cold data on slower storage.
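
As an illustration, a minimal indexes.conf sketch of that split, using your 60-day target with a size cap as protection for the 500 GB volume (the paths are placeholders for your own fast and slow volumes; frozenTimePeriodInSecs and maxTotalDataSizeMB are the standard age and size limits):

[_internal]
homePath = /fast_storage/splunk/_internaldb/db
coldPath = /slow_storage/splunk/_internaldb/colddb
thawedPath = /fast_storage/splunk/_internaldb/thaweddb
# freeze (delete, by default) buckets older than 60 days: 60 * 86400 s
frozenTimePeriodInSecs = 5184000
# hard size cap as a safety net so the index cannot outgrow the volume
maxTotalDataSizeMB = 450000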

Ciao.

Giuseppe

 


hectorvp
Communicator

Thanks @gcusello ,

Since I'm storing only internal logs on my standalone indexer, I may use RAID 0 or 15k RPM SAS HDDs... I'll still think it over.
