My main index is stored in two locations, depending on whether a bucket is cold or hot/warm. I set aside 400 GB of fast storage for the hot/warm buckets, so with a maxDataSize of 10 GB (auto_high_volume) plus 10 hot buckets, I should set maxWarmDBCount to 30, right?
(30 warm buckets * 10GB) + (10 hot buckets * 10GB) = 400GB
The problem is that hot buckets are not reaching their 10 GB limit before they roll, so I end up with 30 warm buckets of varying sizes. As a result, the 400 GB is never completely used; with many ~5 GB buckets it sits only about half full.
How can I make sure that buckets don't get rolled until they reach the maxDataSize limit?
The best approach is to specify a size limit for warm storage and set the maximum bucket count higher than you would ever expect to reach, so that the size limit is what takes effect.
You can do this in two ways:
Set the homePath.maxDataSizeMB for the index
* Limits the size of the hot/warm DB to the maximum specified size, in MB.
* If this size is exceeded, Splunk will move buckets with the oldest value of latest time (for a given bucket) into the cold DB until the DB is below the maximum size.
* If this attribute is missing or set to 0, Splunk will not constrain size of the hot/warm DB.
* Defaults to 0.
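For the 400 GB scenario above, option 1 might look like this in indexes.conf (index name, paths, and exact numbers are illustrative, not your actual config):

```ini
# indexes.conf -- illustrative sketch of option 1
[main]
homePath = /mnt/fast_disk/main/db
coldPath = /mnt/big_disk/main/colddb
thawedPath = /mnt/big_disk/main/thaweddb
maxDataSize = auto_high_volume
# Cap hot/warm storage below the 400 GB disk, leaving some headroom
homePath.maxDataSizeMB = 380000
# Count limit set well above what you expect, so the size limit governs
maxWarmDBCount = 300
```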
Set up a volume, and use that for hot.

# volume definitions; stanza names are prefixed with "volume:"
[volume:hot1]
path = /mnt/fast_disk
maxVolumeDataSizeMB = 100000

[volume:cold1]
path = /mnt/big_disk
# maxVolumeDataSizeMB not specified: no data size limitation on top of the existing ones

[volume:cold2]
path = /mnt/big_disk2
maxVolumeDataSizeMB = 1000000

# index definitions
[idx1]
homePath = volume:hot1/idx1
coldPath = volume:cold1/idx1
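Applied to the 400 GB of fast storage from the question, the volume approach might look like this (volume name, paths, and index name are illustrative):

```ini
# indexes.conf -- illustrative sketch of option 2
[volume:fast]
path = /mnt/fast_disk
maxVolumeDataSizeMB = 400000   # the 400 GB of fast storage

[main]
homePath = volume:fast/main/db
coldPath = /mnt/big_disk/main/colddb
maxDataSize = auto_high_volume
# Count limit set well above expectations so the volume size limit governs
maxWarmDBCount = 300
```

The advantage over option 1 is that several indexes can share the same volume, and the 400 GB cap applies to their combined hot/warm usage.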
Buckets are not guaranteed to reach their maximum size; they can roll over 'early' for a number of reasons. I'd slightly prefer the second option, as it's easier to manage when you have multiple indexes.
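Some of the settings that can cause a hot bucket to roll before it reaches maxDataSize (values shown are illustrative; check the indexes.conf.spec for your version's actual defaults, and note that a splunkd restart also rolls hot buckets):

```ini
# indexes.conf -- settings that can roll hot buckets early (illustrative values)
[main]
maxHotIdleSecs = 0        # nonzero rolls a hot bucket that receives no new data for this long
maxHotSpanSecs = 7776000  # max time span of events in one hot bucket; exceeding it forces a roll
maxHotBuckets = 10        # creating a bucket beyond this count rolls the oldest hot bucket
```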