Set Data Retention Policy to delete data on a rolling basis, not the entire index

Hello,

Below is information I received about data retention:

frozenTimePeriodInSecs sets the maximum age, in seconds, of data. Once
all of the events in an index bucket are older than this age, the
bucket will be frozen (default action: delete). The important thing
here is that the age of a bucket is defined by the newest event in
the bucket, and the event time, not the time at which the event
was indexed.

Is there any way to delete events based on the retention policy on a rolling basis? For example, I have an index that spans 3 years of data. When the oldest data is 5 years old, I want to delete that data, instead of waiting 8 years until the newest data is 5 years old. Does that make sense?

Re: Set Data Retention Policy to delete data on a rolling basis, not the entire index

Are you looking to delete data in this way just to reduce what is being scanned on search, or to save on disk space? Deleting data with the delete command outside of a bucket roll to frozen does not actually remove the events on disk. They are still there, just not referenced by anything or accessible by search.

That said, it's pretty unlikely that you have a single index bucket spanning 3 years of data. At the very least, your hot buckets would have rolled to warm, and new hot buckets would have been created, each time you've restarted Splunk over that time. Realistically, there's a bit more that determines the number of buckets you have, such as the size and data timespan, as shown here. So rather than waiting 8 years to delete the data, the oldest events would more likely be removed within a couple of months of reaching the retention age, once the newest event in their bucket has also aged out.
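To make the per-bucket behavior concrete, here is a minimal Python sketch of the freeze rule described in the question (a bucket freezes once its newest event is older than frozenTimePeriodInSecs). This is not Splunk code. The `Bucket` class, the `frozen_buckets` helper, the 90-day bucket span, and the strict "older than" comparison are all assumptions made up for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class Bucket:
    """Hypothetical stand-in for an index bucket (event times in epoch seconds)."""
    earliest: int  # event time of the oldest event in the bucket
    latest: int    # event time of the newest event in the bucket


def frozen_buckets(buckets, now, frozen_time_period_in_secs):
    """Return the buckets whose newest event is older than the retention age."""
    return [b for b in buckets if now - b.latest > frozen_time_period_in_secs]


# Assumed setup: 5-year retention, ~3 years of data split into 90-day buckets.
retention = 5 * 365 * SECONDS_PER_DAY
span = 90 * SECONDS_PER_DAY
start = 0
buckets = [Bucket(start + i * span, start + (i + 1) * span - 1) for i in range(12)]

# 100 days after the very first event reaches the retention age,
# only the first bucket has fully aged out. The rest are kept.
now = start + retention + 100 * SECONDS_PER_DAY
expired = frozen_buckets(buckets, now, retention)
print(len(expired))  # 1
```

Under these assumptions, each bucket ages out on its own as time passes, so the index is trimmed from the old end bucket by bucket rather than all at once.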

Re: Set Data Retention Policy to delete data on a rolling basis, not the entire index

Hi katzr,

I think you misunderstand how data retention in Splunk works.
An index is made up of many buckets, each storing data from a different time range. Once the retention time has passed and all the data in a bucket is older than the retention period, that bucket rolls to the FROZEN state. Normally it gets deleted, unless you are planning to do something else with it.

So you shouldn't run into the problem you described at any time.
If you still want to guard against even the most unlikely scenario, you could set the maxHotSpanSecs parameter for the index in indexes.conf.

Example:

[main]
...
maxHotSpanSecs = 86399
...

Why set the parameter to 86399 instead of 86400?
Setting it to exactly 86400 caused bugs in the past. I don't know why, or whether that is still the case, but as a safety measure set it to 86399.
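As a rough illustration of what capping the bucket span buys you, the Python sketch below works out the retention value for the 5-year example from the question and an upper bound on how long the oldest event in a bucket can outlive that retention. The helper names are hypothetical, and the bound assumes that a bucket's time span never exceeds maxHotSpanSecs and that a bucket is frozen as soon as its newest event passes the retention age. Neither assumption is confirmed in this thread.

```python
SECONDS_PER_DAY = 86400


def retention_seconds(years):
    """Convert whole years to seconds, ignoring leap days (assumption)."""
    return years * 365 * SECONDS_PER_DAY


def worst_case_deletion_age(frozen_time_period_in_secs, max_hot_span_secs):
    """Upper bound on the age of the oldest event when its bucket freezes.

    The newest event has just crossed the retention age, and the oldest
    event is assumed to be at most one bucket span older than the newest.
    """
    return frozen_time_period_in_secs + max_hot_span_secs


five_years = retention_seconds(5)
print(five_years)  # 157680000, a candidate frozenTimePeriodInSecs value

# Extra time the oldest event may linger beyond the retention age.
extra = worst_case_deletion_age(five_years, 86399) - five_years
print(extra)  # 86399, just under one day
```

In other words, under these assumptions a one-day bucket span means no event should outlive the retention period by much more than a day.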

If you want to know more about how this works in practice, look here:
https://answers.splunk.com/answers/88457/difference-between-maxhotidlesecs-and-maxhotspansecs.html