Deployment Architecture

Help in estimating storage size

Builder

Hello Splunkers,

Below is our retention requirement. We aim to index approximately 250 GB of data per day on a single indexer:
hot - 60 days - Tier 1 - SSD
cold - 4 months - Tier 2 - 10K RPM
frozen - 12 months - Tier 3 - 7200 RPM

I already have an estimate from splunk-sizing(dot)appspot(dot)com. However, the IOPS readings that site gives for a RAID configuration don't match my storage requirement very accurately, so I am not sure how reliable the estimate is.

Hence, please help me figure out the right estimate for this requirement.

Thank you.


Re: Help in estimating storage size

SplunkTrust

Can you elaborate? Are you asking about storage, IOPS, or both?
For storage, divide the daily indexing volume by 2, since Splunk compresses indexed data to approximately 50% of its raw size.
Then multiply by the retention period in days:
hot: 250/2 × 60 = 7,500 GB, or 7.5 TB
cold: 250/2 × 120 = 15,000 GB, or 15 TB
frozen: 250/2 × 365 = 45,625 GB, or ~45.6 TB
If you are planning to use ES or need data replication for clustering, the calculations are different.
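
The same arithmetic can be written as a small Python sketch. This is only an illustration of the rule of thumb above, not an official Splunk formula: the function name `tier_storage_gb`, the `disk_ratio` parameter, and the 1 TB = 1,000 GB conversion are assumptions made for the example, and 4 months and 12 months are taken as 120 and 365 days.

```python
def tier_storage_gb(daily_ingest_gb, retention_days, disk_ratio=0.5):
    """Estimated on-disk size in GB for one retention tier.

    disk_ratio is the assumed fraction of raw data that ends up on disk
    (0.5 matches the ~50% rule of thumb used in this thread).
    """
    return daily_ingest_gb * disk_ratio * retention_days


DAILY_INGEST_GB = 250  # raw data indexed per day
RETENTION_DAYS = {"hot": 60, "cold": 120, "frozen": 365}

# One estimate per tier, single indexer, no replication.
estimates = {
    tier: tier_storage_gb(DAILY_INGEST_GB, days)
    for tier, days in RETENTION_DAYS.items()
}

for tier, size_gb in estimates.items():
    print(f"{tier}: {size_gb:,.0f} GB (~{size_gb / 1000:.1f} TB)")
```

Changing `disk_ratio` or the retention days shows how sensitive the totals are to the compression assumption, which varies with the type of data being indexed.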

hope it helps


Re: Help in estimating storage size

Builder

Hi @adonio,

Thanks for your reply.

My question is mainly about estimating the storage size, and whether the IOPS of the storage would in any way affect that size. We are only planning to use Splunk Enterprise.

EDIT:
One more thing I noticed on splunk-sizing(dot)appspot(dot)com is that it seems to show the wrong value for archived/frozen storage.
With the same settings as above, the results give 11.7 TB for cold (120 days)
and 10.5 TB for frozen/archived storage (365 days).
I am not sure how it came up with that value.


Re: Help in estimating storage size

Champion

"Hot, warm, and cold" are calculated with a 50% compression ratio.
"Frozen" is calculated with a compression ratio of 15%, i.e., stored at about 15% of the raw size.

You also need to account for the capacity needed when restoring (thawing) frozen data and the capacity used by accelerated data models.
For example, restoring 10 days of frozen data requires 250 GB × 50% × 10 days = 1,250 GB.
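
As a rough illustration of these two ratios, the Python sketch below estimates the frozen archive size and the extra space needed to restore part of it. The constant and function names are hypothetical, and the 15% and 50% figures are simply the ratios quoted in this reply, not measured values.

```python
DAILY_INGEST_GB = 250   # raw data indexed per day
FROZEN_RATIO = 0.15     # assumed on-disk fraction for frozen (archived) data
INDEXED_RATIO = 0.50    # assumed on-disk fraction for hot/warm/cold data


def frozen_archive_gb(retention_days):
    """Estimated size of the frozen archive for the given retention."""
    return DAILY_INGEST_GB * FROZEN_RATIO * retention_days


def restore_space_gb(days_restored):
    """Estimated space needed to restore that many days of frozen data,
    assuming restored data occupies the same ~50% as searchable data."""
    return DAILY_INGEST_GB * INDEXED_RATIO * days_restored


print(f"frozen, 365 days: {frozen_archive_gb(365):,.1f} GB")
print(f"restore 10 days:  {restore_space_gb(10):,.1f} GB")
```

The restore space is temporary headroom on top of the hot and cold tiers, so it is worth budgeting separately from the archive itself.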
