Deployment Architecture

Estimating storage requirements when ingesting only internal logs

hectorvp
Communicator

Hi Splunkers,

We need to estimate the disk space required for our single-box Splunk Enterprise deployment.

We are planning to ingest only the internal splunkd logs, and I don't see any way to estimate the disk space required for them. I don't know how many events a single UF generates, nor how large a single event would be.

We will have around 400 UFs running on servers, and we expect a 60-day retention policy.

I'm afraid the 500 GB of space will fill up before 60 days and we won't have the internal logs.

Apart from this, please advise whether I really need RAID 1+0 for internal logs; there would only be a few scheduled searches for health checks and the DMC. Or would some simpler storage suffice?

Is there any way to estimate this part?

1 Solution

gcusello
SplunkTrust

Hi @hectorvp,

The number of _internal events is highly variable, so the correct approach is to measure, in your real environment, how many events you receive and how much storage they occupy.
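For example, on a box that is already receiving UF traffic, a quick sketch to measure the raw daily volume per forwarder (adjust the time range as needed) is:

index=_internal earliest=-1d@d latest=@d
| eval bytes=len(_raw)
| stats sum(bytes) as raw_bytes_per_day, count as events_per_day by host

and, for the actual on-disk occupation of the index:

| dbinspect index=_internal
| stats sum(sizeOnDiskMB) as disk_MB

Run these over a few representative days and extrapolate from there.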

Anyway, you can start this analysis from a baseline of 800,000 events/day, which means around 0.07 GB/day for each server.

For your 400 UFs, that comes to around 28.5 GB/day.

So the storage depends largely on the retention policy you use.

In my projects I usually use 15 or 30 days of retention; I think that is a long enough period to analyze what happened when problems occur, and I don't think older events are useful.

This means 425 or 850 GB of raw events, which take about half that space on disk once compressed.

In conclusion, 400 UFs with a retention of 15 days on two clustered indexers use around 215 GB on each indexer.
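As a quick sanity check against your original scenario (60 days of retention on 500 GB), you can do the same arithmetic in a throwaway search; the 0.07 GB/day per UF and ~50% compression figures are just the rough baselines above, so treat the output as an order-of-magnitude estimate:

| makeresults
| eval gb_per_day = 0.07 * 400
| eval raw_60d_gb = gb_per_day * 60
| eval disk_60d_gb = round(raw_60d_gb * 0.5, 0)
| eval days_in_500gb = round(500 / (gb_per_day * 0.5), 0)
| table gb_per_day, raw_60d_gb, disk_60d_gb, days_in_500gb

With these baselines that is roughly 28 GB/day raw, around 840 GB on disk for 60 days, and 500 GB lasting only about 36 days, so 500 GB is unlikely to cover 60 days for 400 UFs unless your real per-UF volume turns out to be lower.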

About the question of RAID 1+0: it isn't mandatory, but I usually use RAID 1+0 for all indexes, also because _internal logs are usually used to check whether a server is up and running. If you want to consume less high-performance storage, it can be useful to put the cold data on slower storage.
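If you do split the tiers, a minimal indexes.conf sketch of that idea could look like the following (the paths are illustrative, and the size cap should be adapted to your real volume):

# $SPLUNK_HOME/etc/system/local/indexes.conf -- illustrative values only
[_internal]
# hot/warm buckets on the faster storage
homePath = /fast_disk/splunk/_internaldb/db
# cold buckets on the cheaper, slower storage
coldPath = /slow_disk/splunk/_internaldb/colddb
thawedPath = /slow_disk/splunk/_internaldb/thaweddb
# 60-day retention: 60 * 86400 = 5184000 seconds
frozenTimePeriodInSecs = 5184000
# hard cap in MB; oldest buckets are frozen first when it is exceeded
maxTotalDataSizeMB = 450000

Whichever of the retention time or the size cap is reached first wins, and frozen buckets are deleted by default.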

Ciao.

Giuseppe

 


hectorvp
Communicator

Thanks @gcusello,

Since I'm storing only internal logs on my standalone indexer, I may use RAID 0 or 15k RPM SAS HDDs... I'll still think it over.
