I am testing the frozenTimePeriodInSecs setting, so I have edited my /opt/splunk/etc/system/local/indexes.conf
and added the following:
[default]
frozenTimePeriodInSecs= 180
and restarted the app. Immediately afterwards, I searched for index=_internal source=*splunkd.log BucketMover
and verified that the message AsyncFreezer freeze succeeded appears.
Then I uploaded some logs to the main index and waited, but no new AsyncFreezer event was logged and the data I loaded is still there, even after the 180 seconds elapsed.
My expectation is that the AsyncFreezer runs on a regular basis and that the recently uploaded data is then no longer available in search.
What am I missing?
TIA
As others have pointed out there are many tuneable parameters that you may consider setting in indexes.conf but I want to help you understand why you're seeing this behavior. The key quote from the frozenTimePeriodInSecs
parameter in the docs is:
IMPORTANT: Every event in the DB must be older than frozenTimePeriodInSecs before it will roll. Then, the DB
will be frozen the next time splunkd checks (based on rotatePeriodInSecs attribute).
So this means that the newest event in a particular bucket must be more than 180 seconds (3 minutes) old before the bucket is frozen. That bucket could hold plenty of older events as well: with the default maxHotSpanSecs, it could span 90 days' worth of data. As data is sent into Splunk, you have one or more hot buckets that are actively being written, and the newest event in such a bucket could always be right now, because data can be added as long as the hot bucket is open. With the default maxHotIdleSecs, a hot bucket could theoretically stay open indefinitely even with no new events coming in, although other constraints on hot buckets will usually close it sooner. Once the hot bucket is closed and "rolls to warm", no new events can be added to it, and Splunk can then determine whether the newest event is older than frozenTimePeriodInSecs and therefore whether the bucket, with all of its events, qualifies for deletion.
The reason you see the freeze after a restart is that all hot buckets roll to warm when Splunk restarts (and can therefore be frozen). With such a small frozenTimePeriodInSecs, you are likely still seeing data older than this timeframe because your hot buckets are still open and could potentially have new events written into them.
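To make the rule above concrete, here is a minimal Python sketch of the eligibility check as described in this answer. It is only a model for illustration, not Splunk's actual code; the `Bucket` class and `can_freeze` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    state: str                # "hot", "warm", or "cold" (simplified model)
    newest_event_time: float  # epoch seconds of the latest event in the bucket


def can_freeze(bucket: Bucket, now: float, frozen_time_period_in_secs: int) -> bool:
    """Return True if the bucket would qualify for freezing in this model."""
    # A hot bucket can still receive new events, so it is never a candidate.
    if bucket.state == "hot":
        return False
    # Only the newest event matters: every event must be older than the limit.
    return now - bucket.newest_event_time > frozen_time_period_in_secs


now = 1000.0
print(can_freeze(Bucket("hot", 0.0), now, 180))     # open hot bucket
print(can_freeze(Bucket("warm", 0.0), now, 180))    # closed, newest event is old
print(can_freeze(Bucket("warm", 900.0), now, 180))  # closed, newest event too recent
```

In this model an open hot bucket never qualifies, no matter how old its events are, which matches the behavior seen in the question.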
Thanks for your kind answer. I hope you don't mind a quick follow up.
I added the following parameters to indexes.conf
[default]
frozenTimePeriodInSecs = 180
maxHotIdleSecs = 150
maxHotBuckets = 1
maxWarmDBCount = 1
rotatePeriodInSecs = 30
and restarted. Just as before, the data I had disappeared, but the AsyncFreezer event did not repeat after the first time. Following your comment, I added one more parameter, maxHotSpanSecs = 3600, and restarted once again. Now I see that the AsyncFreezer event is triggered every hour. However, after I submitted new data and verified that the AsyncFreezer ran several times over the following hours, these events (located in the index main) are still visible via search and have not been deleted.
If I restart splunk they will be gone but -as you can imagine- that's not what I am after.
Any ideas on what I am missing?
TIA,
I'll admit I haven't restricted indexes to such a level myself, but here is a start on what's happening. As you know, Splunk has a bunch of folders in which you could create an indexes.conf file with those contents. I'm guessing that you put exactly that into $SPLUNK_HOME/etc/system/local/indexes.conf. But how is that configuration being applied? This is where btool comes in handy.
Assuming a fresh install of 6.2.4 and your indexes.conf set where I described, when you run $SPLUNK_HOME/bin/splunk cmd btool indexes list main --debug you'd see that only 4 of the 6 settings you set are actually being applied to main. maxHotBuckets and maxHotIdleSecs are still coming from the default indexes.conf file. That's because, if you peek into the default file, you'd see these settings are set specifically on the main index (as opposed to being set as defaults). The more specific resolved settings in the [main] stanza override the resolved settings in the general [default] stanza. You can add another stanza with these specific settings for main in your local indexes.conf file, or change your [default] stanza to a [main] stanza, and your settings will then all apply. (Don't change the indexes.conf in the default folder; that will get overwritten with every upgrade!)
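A rough way to picture the resolution described above is the Python sketch below. It is a simplified model, not how btool is implemented: `resolve` is a hypothetical helper, and the value 10 for maxHotBuckets under [main] is an assumed illustrative value (the other numbers come from this thread).

```python
def resolve(stanza, layers):
    """Merge config layers (lowest precedence first), then let the specific
    stanza override whatever the merged [default] stanza provides."""
    merged = {}
    for layer in layers:
        for name, settings in layer.items():
            merged.setdefault(name, {}).update(settings)
    effective = dict(merged.get("default", {}))
    effective.update(merged.get(stanza, {}))
    return effective


# etc/system/default/indexes.conf (only the relevant settings; 10 is assumed)
system_default = {
    "default": {"maxHotBuckets": 3, "maxHotIdleSecs": 0},
    "main": {"maxHotBuckets": 10, "maxHotIdleSecs": 86400},
}
# etc/system/local/indexes.conf as posted in the follow-up question
system_local = {
    "default": {
        "frozenTimePeriodInSecs": 180,
        "maxHotIdleSecs": 150,
        "maxHotBuckets": 1,
        "maxWarmDBCount": 1,
        "rotatePeriodInSecs": 30,
        "maxHotSpanSecs": 3600,
    },
}

main = resolve("main", [system_default, system_local])
print(main["maxHotIdleSecs"], main["maxHotBuckets"], main["frozenTimePeriodInSecs"])
```

The local [default] values for maxHotIdleSecs and maxHotBuckets lose to the shipped [main] stanza, while the other four settings reach main because nothing more specific overrides them.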
Out of curiosity though... why are you trying to set such a tight retention policy anyways?
Once again, thanks for your answer. Your comment was right on the spot (but of course you knew this from the start): the [main] stanza was taking precedence over the [default] values for those two parameters. I added [main] to my local indexes.conf and I am waiting for these events to be deleted when the hour elapses. Let's see how it goes.
Regarding your question about this low threshold: it is just for testing purposes (the actual value will be on the order of weeks). The thing is that I have to ship the whole Splunk installation already configured and I will not be able to fine-tune the parameters once it is installed, so I need to be 100% sure it will work. So far, for an expected retention value of two weeks, I believe the three parameters I really need to set are frozenTimePeriodInSecs, rotatePeriodInSecs and maxHotIdleSecs. Do you consider this to be a safe bet? (The other parameters I am touching are only due to the low threshold.)
Good news: it is working as intended. New data is gone once the whole AsyncFreezer process has run (once per hour, due to my maxHotSpanSecs = 3600). Thanks again to all who answered.
Regarding my last comment in the previous answer: I have verified that maxHotIdleSecs is already present in the default indexes.conf for the main stanza with a value of 86400, so I will leave it that way. The one I believe I have to modify is maxHotSpanSecs, since its default value of 90 days is bigger than my expected window of two weeks. I have to be certain that the bucket is closed before frozenTimePeriodInSecs is reached so that the freezing process can take place.
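The reasoning about maxHotSpanSecs versus a two-week window can be checked with a little arithmetic. The Python sketch below is only an approximation under stated assumptions: `oldest_event_age_bound` is a hypothetical helper, and it ignores rotatePeriodInSecs and any extra time the bucket spends hot.

```python
DAY = 24 * 60 * 60  # seconds in a day


def oldest_event_age_bound(frozen_time_period_in_secs, max_hot_span_secs):
    """Approximate age of the oldest event when a bucket first becomes
    eligible to freeze: the newest event must be older than the retention
    period, and the oldest event can be up to one bucket span older still."""
    return frozen_time_period_in_secs + max_hot_span_secs


two_weeks = 14 * DAY      # intended frozenTimePeriodInSecs
default_span = 90 * DAY   # default maxHotSpanSecs mentioned in this thread

print(default_span > two_weeks)
print(oldest_event_age_bound(two_weeks, default_span) // DAY)  # days, 90-day span
print(oldest_event_age_bound(two_weeks, DAY) // DAY)           # days, 1-day span
```

With a 90-day span, the oldest events in a bucket could be roughly 104 days old before the bucket qualifies; with a one-day span the figure drops to about 15 days.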
Hi marplatense,
I am facing the same issue here. Can you please share the indexes.conf you used? I am using the following, but no luck!
[my_index]
coldPath = $SPLUNK_DB/my_index/colddb
homePath = $SPLUNK_DB/my_index/db
thawedPath = $SPLUNK_DB/cold/my_index/thaweddb
maxHotSpanSecs = 300
frozenTimePeriodInSecs = 300
rotatePeriodInSecs = 30
repFactor = auto
Here I am just testing with a 5-minute rolling window, but this doesn't work as expected until I restart.
According to the indexes.conf documentation, you can't set maxHotSpanSecs lower than 3600 seconds.
Well, OK, you can, but:
If you set this parameter to less than 3600, it will be automatically reset
to 3600, which will then activate snapping behavior
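Applied to the configuration posted above, that documented floor works out as in this small Python sketch. It only models the reset to 3600 quoted from the docs, not the snapping behavior; `effective_max_hot_span_secs` is a hypothetical name.

```python
MIN_HOT_SPAN_SECS = 3600  # floor quoted from the indexes.conf documentation


def effective_max_hot_span_secs(configured: int) -> int:
    """Values below 3600 are reset to 3600, per the quoted docs."""
    return max(configured, MIN_HOT_SPAN_SECS)


print(effective_max_hot_span_secs(300))    # the value from [my_index] above
print(effective_max_hot_span_secs(86400))  # values at or above the floor are kept
```

So a maxHotSpanSecs of 300 would behave like 3600, which fits the hourly rolling reported earlier in the thread.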
Glad I could help @marplatense! I'm not sure if you're aware of this, but instead of using the "award points" link (which deducts from your answers karma) if you feel an answer, question, or comment is particularly good, you can use the upvote button (^) instead. Doing so awards karma to whomever helped you out, while keeping your karma intact, and helps with rankings of content too. I've awarded you the karma that you gave away in this manner.
To enable data retention with such a short retention period, you need to configure a few more properties as well:
frozenTimePeriodInSecs - use your setting; only buckets whose events are all older than this period will be rolled to frozen
maxHotIdleSecs - defaults to 0 (infinite); set it to the same value as frozenTimePeriodInSecs or lower
maxHotBuckets - defaults to 3; set it to 1 to make hot buckets roll to warm sooner
maxWarmDBCount - defaults to 300; set it to a smaller number, e.g. 1
rotatePeriodInSecs - defaults to 60, which should be enough for your retention period, but it is best to reduce it to 30 seconds
Thanks in advance, I will test these values a.s.a.p. One question though: I am using the low value just for testing; in the real application I will use a bigger value. My question is whether the settings you are recommending apply to all scenarios or just to the one I am proposing.
TIA
You also need to set rotatePeriodInSecs, which defaults to 60 seconds and controls how often Splunk checks whether any buckets need to be rolled to frozen. With a 180-second frozenTimePeriodInSecs and the 60-second default, you could potentially see a 240-second delay before events are moved. Think of frozenTimePeriodInSecs as the threshold and rotatePeriodInSecs as the frequency at which Splunk checks data against that threshold.
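The 240-second figure follows from how a periodic check interacts with a threshold. Here is a minimal Python sketch of that timing, assuming a simplified model where a closed bucket is examined every rotatePeriodInSecs and qualifies only once its newest event is strictly older than the threshold (`first_freeze_check` is a hypothetical name):

```python
def first_freeze_check(age_at_first_check, frozen_time_period_in_secs,
                       rotate_period_in_secs):
    """Return the age (in seconds) of the newest event at the first periodic
    check where the bucket qualifies for freezing in this simplified model."""
    age = age_at_first_check
    # Keep waiting one check interval at a time until the threshold is exceeded.
    while age <= frozen_time_period_in_secs:
        age += rotate_period_in_secs
    return age


# Worst case: a check lands exactly at 180 s, so the bucket waits a full period.
print(first_freeze_check(0, 180, 60))
# A slightly different check phase catches it almost immediately after 180 s.
print(first_freeze_check(1, 180, 60))
```

Depending on when the checks happen to fall, the actual delay lands somewhere between just over frozenTimePeriodInSecs and frozenTimePeriodInSecs plus one rotatePeriodInSecs.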