Hi, I'm trying to be certain I understand the process of data moving to frozen (i.e. deleted, in our case).
I have added some local settings in indexes.conf so that our frozenTimePeriodInSecs equates to 28 days and maxHotSpanSecs equates to 21 days across the board, so that we definitely have data at least in the warm state that would need freezing - we have 2 indexes this should explicitly impact.
Reading through the various documentation, it says that to see whether data IS moving to frozen you should check for BucketMover in the splunkd logs. When I check that I can see a good number of entries, but I am surprised to see that it's not every day - is that because the whole of a warm or cold bucket has to fall within the criteria set by frozenTimePeriodInSecs?
For the day I initially set the amendments up (15th May) I can see that BucketMover successfully froze a number of our NMON index buckets, but since then, as far as I can tell, the only thing BucketMover has frozen is the _introspection index, and that was on 20th May - previously this had only been frozen on the 6th, 7th, 13th, 14th and 15th May.
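For reference, the kind of search I've been using to spot this activity (going off the standard _internal logging - I may be filtering more narrowly than I need to):

```
index=_internal sourcetype=splunkd component=BucketMover
```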
The primary reason for checking this is that I planned our retention policy based on calculated data usage per day and available filesystem space, and we're getting close to the configured limit - so before making any other changes I need to be certain I understand.
We're running Splunk version 6.2.2 on RHEL 2.6
One other question - I've set "Pause indexing if free disk space (in MB) falls below" to 1GB, but have read some posts saying there should be ~5GB to enable the freezing. I don't think that's the case, as I am seeing some freezing activity going on. I'm limited to 20GB for this proof of concept, so I don't want the reserved space too big. What's the minimum it can sensibly be?
The most important thing to remember is that freezing (and the other transitions from hot to warm to cold) operates strictly on entire buckets.
You are correct: for a bucket to qualify to be frozen based on event time, the entire bucket needs to be past frozenTimePeriodInSecs - otherwise you would be removing data that doesn't yet qualify to be removed. With your settings a bucket could span up to 21 days, and that bucket would not freeze until 28 days after its last event, which is up to 49 days after the first event in the bucket.
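To make the arithmetic concrete, here is a hypothetical worst-case timeline using your settings (this is just an illustration of the dates, not anything Splunk computes in this form):

```python
from datetime import datetime, timedelta

# Illustrative values from the poster's configuration:
BUCKET_SPAN_DAYS = 21    # maxHotSpanSecs expressed in days
FROZEN_AFTER_DAYS = 28   # frozenTimePeriodInSecs expressed in days

# Hypothetical bucket whose earliest event is 22nd April
first_event = datetime(2015, 4, 22)
last_event = first_event + timedelta(days=BUCKET_SPAN_DAYS)

# The bucket is only eligible to freeze once its *newest* event
# is older than frozenTimePeriodInSecs.
freeze_date = last_event + timedelta(days=FROZEN_AFTER_DAYS)
days_from_first_event = (freeze_date - first_event).days

print(freeze_date.date())      # 2015-06-10
print(days_from_first_event)   # 49
```

So events from 22nd April could, at worst, stay on disk until 10th June - 49 days - if they landed at the very start of a maximally wide bucket.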
Now, to configure freezing based on the size of your index, you should consider two other primary settings around retention, namely maxDataSize and maxTotalDataSizeMB, also in indexes.conf. The former defines how large (in MB) a bucket can grow, and the latter how large the entire index can be. When Splunk goes to make a new hot bucket, if there is not approximately maxDataSize of headroom under maxTotalDataSizeMB, then the oldest bucket is frozen. (Like the others, these are set on a per-index basis and have default values.)
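A sketch of what such a stanza might look like - the index name and the size cap here are made up for illustration, and the two time settings are just your 28 and 21 days expressed in seconds:

```
# indexes.conf -- illustrative values only
[nmon]
frozenTimePeriodInSecs = 2419200   # 28 days * 86400
maxHotSpanSecs         = 1814400   # 21 days * 86400
maxDataSize            = auto      # roughly 750MB per bucket
maxTotalDataSizeMB     = 15000     # hypothetical ~15GB cap on the whole index
```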
Now regarding the question in your comment: that setting doesn't really apply here. If your free disk space reaches that level or less, new events stop coming in (potentially being dropped if your queues fill up). It doesn't trigger freezing, but of course when buckets do qualify to be frozen, the disk space released could allow new events to come in again.
Thanks acharlieh, I've just checked, and on our largest index - nmon - it looks like I have on average about 2-3 warm buckets per day (I'm taking it that a bucket is a complete db_ directory, e.g. db_1429931523_1429802386_18). That surprised me, as maxDataSize is set to auto, which according to the docs means approx 750MB - I presume a lot more factors than just size go into when a hot bucket rolls. But based on that discovery I think I should be OK, and I should see some freezing of those buckets tomorrow, as they only stretch back as far as 22nd April, i.e. 28 days ago. And then I suppose it will happen daily, at least for those days where I have more than one bucket.
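(Incidentally, counting db_ directories by hand worked, but I gather a search along these lines would report bucket states and ages directly - I haven't verified the exact field names on our 6.2.2 instance:)

```
| dbinspect index=nmon
| table bucketId, state, startEpoch, endEpoch, sizeOnDiskMB
```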
Yep, there are a number of other factors that can cause rolling from hot to warm. Most commonly, if Splunk is restarted, then all hot buckets are rolled to warm. Another possible scenario: there is a setting for the maximum number of open hot buckets, and if you're taking in data with disparate enough timestamps, Splunk could decide it needs another hot bucket - if you're already at the maximum (I think the default is three) then one of them gets rolled to warm. And many other possibilities as well 🙂
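That hot-bucket cap is also an indexes.conf setting. A sketch of where it would live, with an illustrative value - I believe 3 was the default around this version, but do check the indexes.conf spec for your release:

```
# indexes.conf -- illustrative only
[nmon]
maxHotBuckets = 3   # open hot buckets allowed before one is rolled to warm
```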