I have a question about bucket rotation and the number of files in a bucket.
Here are our settings for index=main:
homePath = /splunkidx/defaultdb/db
coldPath = /splunkidx/defaultdb/colddb
thawedPath = /splunkidx/defaultdb/thaweddb
maxDataSize = auto_high_volume
maxTotalDataSizeMB = 400000
maxHotSpanSecs = 86400
frozenTimePeriodInSecs = 2592000
maxWarmDBCount = 30
The goal was to have 30 days' worth of data (give or take a day). With 86400 = 1 day, that tells me a hot bucket should stay around 1 day, then roll to warm. With maxWarmDBCount=30, that says stuff stays in warm for 30 days and then rolls to cold. frozenTimePeriodInSecs=2592000 = 30 days, so data should get deleted "about" every 30 days. Worst case it sticks in cold for 30 days, which means we really might have something like 60 days' worth of data. That still tells me we would have something close to 30-60 files in colddb.

We recently ran into a problem where there were 32,000 files in the colddb folder, which is a Linux filesystem limit, and it caused issues with buckets not rolling from warm to cold. How does this happen? Data coming in with bad date/time stamps (dates older than 30-60 days)?
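For the record, here is the retention arithmetic behind that reasoning as a quick sanity check (my own illustration, not anything Splunk runs):

```python
# Sanity-check the retention math from the stanza above.
SECONDS_PER_DAY = 86400

max_hot_span = 86400        # maxHotSpanSecs: intended ~1 day per hot bucket
frozen_after = 2592000      # frozenTimePeriodInSecs

# 2592000 seconds is exactly 30 days before data is frozen (deleted).
print(frozen_after / SECONDS_PER_DAY)  # 30.0

# Note: maxWarmDBCount is a *bucket count*, not a day count. 30 warm
# buckets only equals ~30 days if each bucket really spans one day;
# buckets are cut by event timestamp, so badly timestamped data can
# create far more than one bucket per calendar day.
```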
Looking at the docs for indexes.conf I see:
maxHotSpanSecs = positive integer
- Upper bound of timespan of hot/warm buckets, in seconds.
- Defaults to 7776000 seconds (90 days).
- NOTE: If you set this too small, you can get an explosion of hot/warm buckets in the filesystem.
- If you set this parameter to less than 3600, it will be automatically reset to 3600, which will then activate snapping behavior (see below).
- This is an advanced parameter that should be set with care and understanding of the characteristics of your data.
- If set to 3600 (1 hour) or 86400 (1 day), it also becomes the lower bound of hot bucket timespans. Further, snapping behavior (i.e. ohSnap) is activated, whereby hot bucket boundaries will be set at exactly the hour or day mark, relative to local midnight.
- Highest legal value is 4294967295.
maxHotIdleSecs = nonnegative integer
- Maximum life, in seconds, of a hot bucket.
- If a hot bucket exceeds maxHotIdleSecs, Splunk rolls it to warm.
- This setting operates independently of maxHotBuckets, which can also cause hot buckets to roll.
- A value of 0 turns off the idle check (equivalent to infinite idle time).
- Defaults to 0.
- Highest legal value is 4294967295.
Should we be looking at using maxHotIdleSecs?
My questions are:
2) Do not mess with anything other than frozenTimePeriodInSecs - just leave the rest at the defaults.
I agree with Kristian. I would remove maxHotSpanSecs = 86400. This is not something that you should normally set. Plus, this setting is not based on the time when the data is indexed; it is based on the actual timestamp of the events. So events with different dates would be in different buckets, even if they arrived at the indexer at the same time.
Since you have set frozenTimePeriodInSecs, you should not need to do anything else.
However, if you want to fine-tune further, you could set maxDataSize to the size of a single day's data (but not less than 750 MB). This would not guarantee a bucket per day, because Splunk optimizes the placement of data in buckets to speed searching, and Splunk is usually working with multiple hot buckets simultaneously. But it might help. You can also set maxHotIdleSecs, but I would not set it lower than 86400.
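Putting that advice together, a trimmed stanza might look something like this (illustrative only; keep your actual paths, and treat the maxDataSize suggestion as something you would tune to your real daily volume):

```
[main]
homePath = /splunkidx/defaultdb/db
coldPath = /splunkidx/defaultdb/colddb
thawedPath = /splunkidx/defaultdb/thaweddb
maxTotalDataSizeMB = 400000
frozenTimePeriodInSecs = 2592000
# maxHotSpanSecs removed - leave at the default (90 days)
# maxWarmDBCount removed - leave at the default
# Optional fine-tuning: keep maxDataSize = auto_high_volume, or set it
# near one day's indexed volume (but not less than 750 MB)
maxDataSize = auto_high_volume
```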
A hot bucket rolls to warm when: (1) Splunk restarts; (2) it fills; (3) it receives no new data for maxHotIdleSecs; (4) Splunk needs to open a new bucket and it is the oldest open bucket; or (5) maybe other reasons that I don't know about. My guess in this case is that maxHotSpanSecs caused the problem. Data arriving with inconsistent timestamps could certainly be part of the problem, too.
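To illustrate how that guess plays out, here is a rough sketch (my own model, not Splunk internals) of day-snapped bucketing: with maxHotSpanSecs=86400 and snapping active, bucket boundaries follow the *event* timestamp, so a batch of events arriving together but stamped with scattered dates spreads across many buckets:

```python
SECONDS_PER_DAY = 86400

def day_bucket(epoch_secs):
    """Day-snapped bucket key: the midnight boundary of the day that
    contains the event's timestamp (Splunk snaps to local midnight;
    this sketch just snaps to a fixed day boundary)."""
    return epoch_secs - (epoch_secs % SECONDS_PER_DAY)

# Three events indexed at the same moment, but with event timestamps
# of "now", 45 days ago, and 200 days ago:
now = 1700000000
events = [now, now - 45 * SECONDS_PER_DAY, now - 200 * SECONDS_PER_DAY]

buckets = {day_bucket(t) for t in events}
print(len(buckets))  # 3 -- one bucket per distinct event day
```

Scale that up to a feed with thousands of distinct bad dates and you can plausibly reach the 32,000-entries-per-directory limit in colddb.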