We are in the process of building out a whole new Splunk environment. As a result we are trying to be thoughtful about every piece of the new environment to make it as efficient as possible.
One question I have about hot/warm and cold storage: should they be on physically different volumes? One advantage I see in keeping them on the same volume is that when buckets roll to cold, the data doesn't have to be moved to a different volume, which saves some time. However, I also want to consider read/write contention: with separate volumes, cold reads wouldn't interfere with the reads and writes on the hot/warm volume. My gut tells me to use separate volumes, but I've not seen anything in the docs recommending one or the other. Maybe it's there and I'm just not finding it. Thanks.
Hi @fredclown,
Whether to put hot/warm and cold buckets on the same storage or on different storage depends on how much data you have to store and whether you can afford fast storage for cold buckets as well.
In other words: you need fast storage (at least 800 IOPS, preferably 1200) for hot and warm buckets, which usually serve 85-90% of your searches.
Hot and warm data is usually retained for at most one month, to limit storage costs.
Cold buckets are usually searched much less often, so you can put them on slower (and cheaper) storage, especially since cold retention is usually longer (6 to 12 months or more).
The cost of moving buckets from warm to cold isn't a problem, because it doesn't happen very often.
In conclusion: if you have money to spare, you can use fast storage for all your data, but it's probably better to invest in very fast storage for hot and warm buckets and use slower storage for cold buckets.
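To make the cost trade-off concrete, here is a rough sizing sketch in Python. It only shows how the retention split above turns into disk space per tier. The function name `tier_sizes_gb`, the 100 GB/day ingest figure, and the `disk_ratio` of 0.5 (the often-quoted rule of thumb that indexed data takes about half of its raw size on disk) are assumptions for illustration; the real ratio depends heavily on your data.

```python
def tier_sizes_gb(daily_ingest_gb, hot_warm_days, cold_days, disk_ratio=0.5):
    """Estimate disk space (GB) needed for the hot/warm tier and the cold tier.

    disk_ratio is an assumed on-disk size relative to raw ingest;
    measure it on your own data before buying storage.
    """
    on_disk_per_day = daily_ingest_gb * disk_ratio
    hot_warm_gb = on_disk_per_day * hot_warm_days
    cold_gb = on_disk_per_day * cold_days
    return hot_warm_gb, cold_gb


# Hypothetical example: 100 GB/day, one month hot/warm, the rest of a year cold.
hot_warm, cold = tier_sizes_gb(daily_ingest_gb=100, hot_warm_days=30, cold_days=335)
print(f"hot/warm: {hot_warm:.0f} GB, cold: {cold:.0f} GB")
```

With these assumed numbers the cold tier is more than ten times the size of the hot/warm tier, which is why paying for fast storage only on the hot/warm volume usually makes sense.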
Ciao.
Giuseppe
Hi
Currently, in many cases and especially in cloud environments, the best practice is to use SmartStore for everything other than hot buckets. With SmartStore there are no cold buckets on the node, and even warm buckets live in the remote store (such as AWS S3). The local hot buckets and the SmartStore cache are usually on NVMe disk.
r. Ismo