Sizing on SmartStore (S3) for local storage

ajiwanand
Path Finder

The SmartStore documentation says the following:

"The amount of local storage available on each indexer for cached data must be in proportion to the expected working set. For best results, provision enough local storage to accommodate the equivalent of 30 days' worth of indexed data."

Is this the same as hot bucket data, or is it on top of the hot data?

For example, assuming the following factors:
Intake = 100GB/day
Compression ratio = 0.50
Hot Retention = 14 days

Using this formula found in another forum post:
Global Cache sizing = Daily Ingest Rate x Compression Ratio x (RF x Hot Days + (Cached Days - Hot Days))
Cache sizing per indexer = Global Cache sizing / No.of indexers

Cached Days = Splunk recommends 30 days for Splunk Enterprise and 90 days for Enterprise Security
Hot Days = Number of days before hot buckets roll over to warm buckets. Ideally this will be between 1 and 7, but configure it based on how hot buckets roll in your environment.

100 x 0.50 x (2 x 14 + (30 - 14)) = 2200GB?
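
The arithmetic above can be checked with a short Python sketch of the quoted formula. The function and parameter names are made up for illustration and are not part of any Splunk tooling. RF = 2 is taken from the calculation above; the indexer count of 3 is only an assumption, since the question does not state one.

```python
def global_cache_gb(daily_ingest_gb, compression_ratio, rf, hot_days, cached_days):
    """Global cache size in GB, per the forum formula quoted above."""
    if cached_days < hot_days:
        raise ValueError("cached_days must be >= hot_days")
    # On-disk size of one day of indexed data.
    stored_per_day_gb = daily_ingest_gb * compression_ratio
    # Hot days are counted RF times; the remaining cached days are counted once.
    return stored_per_day_gb * (rf * hot_days + (cached_days - hot_days))


def per_indexer_cache_gb(global_gb, indexers):
    """Spread the global cache evenly across the indexers."""
    return global_gb / indexers


total = global_cache_gb(100, 0.50, rf=2, hot_days=14, cached_days=30)
print(total)  # 2200.0

# Assumption: 3 indexers (not stated in the question).
print(round(per_indexer_cache_gb(total, indexers=3), 1))
```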


nickhills
Ultra Champion

To calculate the "equivalent", it's not hot buckets you need to calculate, it's Hot+Warm.
There should only be one hot bucket per index: the one that is currently being written to. You can include it or exclude it; it's almost insignificant in this calculation.

Your formulas look more or less sensible though.

You don't mention how many indexers you have, but I assume it's 3, with SF/RF of 2.

Daily Ingest Rate x Compression Ratio = 100 x 0.5 = 50GB per day
(RF x Warm Days + (Cached Days - Warm Days)): I would just call this 30 days and build in some margin. Thus:
RF x Total Days Available in Cache = 2 x 30 = 60
So:
50 x 60 = 3000GB of storage for cache
Finally:
3TB / 3 indexers = 1TB of free storage for cache/local storage per indexer.
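
As a rough sketch, the simplified estimate could be written in Python as follows. The function name and parameters are hypothetical, and the 3 indexers and SF/RF of 2 are the assumptions stated in this answer, not confirmed values from the question.

```python
def simple_cache_estimate_gb(daily_ingest_gb, compression_ratio, rf, cached_days, indexers):
    """Simplified estimate: treat every cached day as if it were replicated RF times."""
    # On-disk size of one day of indexed data.
    stored_per_day_gb = daily_ingest_gb * compression_ratio
    total_gb = stored_per_day_gb * rf * cached_days
    # Return the cluster-wide total and the even per-indexer share.
    return total_gb, total_gb / indexers


total_gb, per_indexer_gb = simple_cache_estimate_gb(
    daily_ingest_gb=100, compression_ratio=0.5, rf=2, cached_days=30, indexers=3
)
print(total_gb, per_indexer_gb)  # 3000.0 1000.0
```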

If my comment helps, please give it a thumbs up!

ajiwanand
Path Finder

Forgive my ignorance here, but does this mean that we do not need to calculate Hot days, and that Hot days and Cached days are interchangeable?

Would the local storage calculation then be based on Cached days (per your formula), with warm storage living on S3?

nickhills
Ultra Champion

I'm approximating: if you calculate the amount of storage that Hot+Warm data for your target time period (30 days) would consume on a non-SmartStore system, that will roughly match your desired cache size on a SmartStore-enabled instance.

You index 100GB a day, which (regardless of whether it's in hot or warm) consumes 50GB of storage.
You want to keep that searchable for 30 days.
Your cluster has SF/RF of 2.

50 x 30 x 2 = 3TB
With 3 indexers, allocate each of them 1TB
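
To see how much margin this shortcut adds over the formula from the original question, the two can be compared side by side. This is only an illustrative Python sketch using the numbers from this thread (14 hot days, 30 cached days, RF 2); the variable names are made up.

```python
stored_per_day_gb = 100 * 0.5            # daily ingest x compression ratio
rf, hot_days, cached_days = 2, 14, 30    # values used in this thread

# Formula from the question: only the hot days are counted RF times.
detailed_gb = stored_per_day_gb * (rf * hot_days + (cached_days - hot_days))

# Shortcut: every cached day is counted RF times.
simplified_gb = stored_per_day_gb * rf * cached_days

# The difference is the extra copies of the non-hot cached days.
margin_gb = simplified_gb - detailed_gb

print(detailed_gb, simplified_gb, margin_gb)  # 2200.0 3000.0 800.0
```

With these numbers the shortcut lands 800GB above the detailed formula, which is the built-in margin mentioned earlier.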


ajiwanand
Path Finder

Thanks! Don't know why this concept was so hard to grasp 🙂
