Deployment Architecture

Should Hot/Warm and Cold storage be on physically different volumes?

fredclown
Builder

We are in the process of building out a whole new Splunk environment. As a result we are trying to be thoughtful about every piece of the new environment to make it as efficient as possible.

One question I have about hot/warm and cold storage: should these be on physically different volumes? One advantage I see in keeping them on the same volume is that when buckets roll to cold, the data doesn't have to be moved to a different volume, which saves some time. However, I also want to consider read/write contention: with separate volumes, cold reads wouldn't interfere with reads and writes on the hot/warm volume. My gut tells me to use separate volumes, but I've not seen anything in the docs recommending one or the other. Maybe it's there and I'm just not finding it. Thanks.

gcusello
SplunkTrust

Hi @fredclown,

The choice between keeping hot/warm and cold buckets on the same storage or on different storage depends on how much data you have to store and whether you can afford fast storage for cold buckets as well.

In other words: you must have fast storage (at least 800 IOPS, preferably 1200) for hot and warm buckets, which usually serve 85-90% of your searches.

Hot and warm data is usually retained for one month at most, to limit storage costs.

Cold buckets are usually searched far less often, so you can use slower (and less expensive) storage for them, also because cold retention is usually longer (6 to 12 months or more).

The performance cost of moving buckets from warm to cold isn't a problem, because it isn't a frequent operation.

In conclusion: if you have money to spare, you can use fast storage for all your data, but it's probably better to invest in very fast storage for hot and warm buckets and use slower storage for cold buckets.
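
As a rough illustration of the split described above, an indexes.conf along these lines would place hot/warm and cold buckets on separate volumes. This is only a sketch: the mount points, volume names, size limits, and the index name example_index are hypothetical placeholders, not values from this thread.

```ini
# indexes.conf (sketch; all paths, sizes, and names are hypothetical)

# Fast storage for hot and warm buckets
[volume:hotwarm]
path = /mnt/fast_disk/splunk
maxVolumeDataSizeMB = 900000

# Slower, cheaper storage for cold buckets
[volume:cold]
path = /mnt/slow_disk/splunk
maxVolumeDataSizeMB = 5000000

[example_index]
homePath = volume:hotwarm/example_index/db
coldPath = volume:cold/example_index/colddb
# thawedPath cannot reference a volume
thawedPath = $SPLUNK_DB/example_index/thaweddb
# Total retention before buckets freeze: 365 days
frozenTimePeriodInSecs = 31536000
```

Note that buckets roll from warm to cold based on size or count limits (for example maxVolumeDataSizeMB, homePath.maxDataSizeMB, or maxWarmDBCount) rather than on age, so a target such as "about one month in hot/warm" has to be approximated by sizing the fast volume against daily ingest.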

Ciao.

Giuseppe

isoutamo
SplunkTrust

Hi

Currently, in many cases and especially in cloud environments, the best practice is to use SmartStore for everything other than hot buckets. With SmartStore there are no cold buckets on the node, and even warm buckets live in the remote store (such as AWS S3). Everything kept locally, hot buckets plus the cache, usually sits on NVMe disk.
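
For reference, a SmartStore-enabled index might be declared roughly as below. Treat it as a minimal sketch: the S3 bucket name, volume name, and index name are hypothetical, and the credential, endpoint, and cache sizing settings a real deployment needs are left out.

```ini
# indexes.conf (sketch; bucket, volume, and index names are hypothetical)

# Remote object store that holds warm buckets
[volume:remote_store]
storageType = remote
path = s3://example-bucket/splunk-indexes

[example_index]
remotePath = volume:remote_store/$_index_name
# Local paths hold hot buckets and the cache of recently used buckets
homePath = $SPLUNK_DB/example_index/db
# coldPath must still be set even though SmartStore indexes do not use it
coldPath = $SPLUNK_DB/example_index/colddb
thawedPath = $SPLUNK_DB/example_index/thaweddb
```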

r. Ismo
