Hi everyone, I want to prevent warm buckets from rolling to cold (not to disable cold entirely, since having a coldPath is mandatory). The reason is that my hot/warm and cold buckets all sit on the same fast storage, and since I also have to define maxVolumeDataSizeMB for the coldPath volume, I want to use as much of that storage as possible for homePath. Here's an example of what I mean.
Total Disk Space: 100GB
homePath's maxVolumeDataSizeMB: 90GB
coldPath's maxVolumeDataSizeMB: 10GB
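For reference, a split like the one above would typically be expressed as volume stanzas in indexes.conf. This is just a sketch; the volume names below are placeholders, not taken from my actual config:

```
# Hypothetical indexes.conf sketch; volume names are placeholders.
[volume:hotwarm]
path = /fastdisk
# 90 GB expressed in MB (90 * 1024)
maxVolumeDataSizeMB = 92160

[volume:cold]
path = /fastdisk
# 10 GB expressed in MB (10 * 1024)
maxVolumeDataSizeMB = 10240
```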
I want to configure the indexes so that buckets move to cold as rarely as possible, so that I can reduce the coldPath volume limit to just 1GB and free up 9GB to allocate to homePath on each of my other 30 indexers.
From my Monitoring Console I can see that coldPath is not used much, so 9GB across all indexers adds up to a lot of under-utilized space. Based on these stats I could set it to 1GB today, but usage might suddenly increase one day, which leads to my question above: I want to set it in a deterministic way. Any advice is appreciated.
Thanks for helping, but I don't think it addresses my question, unfortunately.
Even if I set maxTotalDataSizeMB to 100GB for my example above, how can I limit or prevent buckets from moving to cold if I want to set the cold volume to just 1GB? A side question: how do I calculate how much storage to allocate for cold?
If you have warm and cold on the same storage, why would you even bother? The warm/cold distinction exists so you can move older (cold) data to cheaper, lower-performance storage. There's no other reason for having or not having it; searching doesn't change, and you don't have to do anything differently just because buckets moved from warm to cold. So where's the problem?
It does bother me because if I set the coldPath volume's maxVolumeDataSizeMB to 10GB as in my example, that reserves 10GB of fast SSD storage for cold buckets. Wouldn't that be a waste of fast storage if most of it sits unused across 30 indexers? If I set the coldPath's maxVolumeDataSizeMB too large, I waste fast storage; but if I set it too low, buckets will roll to frozen (deleted), hence my original question.
OK, I don't understand one thing. In your opening post you said that both the hot/warm and cold storage are on the same physical device. So why bother creating two separate volumes from it, especially if you don't want to set up separate usage limits? That confuses me, I must say 🙂
It's so that if I ever need to move coldPath to slower storage in future, I can easily do so by just changing the path (e.g. from /fastdisk to /slowdisk). My volumes in Splunk are configured on the same fast storage, roughly like this (the volume names here are placeholders for my actual ones):

```
[volume:hotwarm]
path = /fastdisk

[volume:cold]
path = /fastdisk
```
Because coldPath is mandatory, I must define a volume for it, which means I am "forced" to set a limit on it.
OK. I understand your point, but if you wanted to move this storage somewhere else you could just as well move the coldPath itself, since you will have to move the contents manually anyway. And it will be bothersome in a clustered environment.
But in your case, well, you're effectively asking Splunk to do two opposite things: limit the data and don't limit the data. 😉 Splunk doesn't care that both of your volumes are on the same physical storage, so it applies the volume limit as it's supposed to.
What you could do: there are two possible approaches.
1) Just set the "warm volume" limit to the whole 100GB and set the "cold volume" limit to some negligible amount. Not a very pretty solution. Also, volume limits are meant to be safeguards on storage size, not a typical way of managing the bucket life cycle, especially if you want separate tiers for hot/warm and cold, since exceeding the volume size can immediately move buckets from hot to cold, skipping warm completely. Depending on the activity in your indexes, that might not be what you want.
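As a sketch, approach 1 expressed as indexes.conf volume stanzas (volume names are placeholders, and sizes follow the 100GB example) could look like:

```
# Approach 1 sketch: give the hot/warm volume (almost) the whole disk
# and shrink the cold volume to a token size. Names are hypothetical.
[volume:hotwarm]
path = /fastdisk
maxVolumeDataSizeMB = 102400   # ~100 GB

[volume:cold]
path = /fastdisk
maxVolumeDataSizeMB = 1024     # ~1 GB, effectively negligible
```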
2) Set the homePath limits for your indexes to "normal" values and set the coldPath limit to some negligible amount. This way, as soon as a bucket rotates to cold, it will be pushed out due to the size limit (I'm not sure whether it will be rolled immediately or whether it would have to wait for the next housekeeping cycle, by default 60 seconds).
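Approach 2 can be sketched per index using the per-path size limits in indexes.conf. The index and volume names below are placeholders, and the sizes are illustrative only:

```
# Approach 2 sketch: cap homePath per index at a "normal" value and
# keep coldPath tiny, so buckets that roll to cold are frozen out quickly.
# Index name, volume names, and sizes are hypothetical.
[my_index]
homePath   = volume:hotwarm/my_index/db
coldPath   = volume:cold/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
homePath.maxDataSizeMB = 3072   # ~3 GB of hot/warm data per index
coldPath.maxDataSizeMB = 100    # negligible cold tier
```

Note that thawedPath cannot reference a volume, which is why it uses a plain path here.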