Splunk Enterprise

Index tiering based on age (not on size)

bizza
Path Finder

Hi,
is there a way to roll data from hot to warm, and from warm to cold, per index, based on the data's age?

I found only

frozenTimePeriodInSecs 

for age-based retention; the other parameters are based on size.

Regards

bizza

1 Solution

bwooden
Splunk Employee

There is a way to do this - but I would not encourage its use as it may unintentionally impact search performance. It is typically more performant to tell Splunk how much storage it may use per volume (or per index where different types of data have different retention requirements). Splunk does a good job figuring out what data to put in which bucket based on time. It is not usually beneficial to purge data in an index based on its age because we must first force Splunk to bucket this data based on our calculations (which are not usually optimal).
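
For illustration, the size-based approach described above could look something like the following in indexes.conf. The volume names, paths, and size limits here are placeholder values, not recommendations:

  # Cap the hot/warm tier by total size across every index that uses this volume
  [volume:hotwarm]
  path                = /opt/splunk/var/lib/splunk
  maxVolumeDataSizeMB = 500000

  # Cap the cold tier the same way, typically on cheaper storage
  [volume:cold]
  path                = /mnt/splunk_cold
  maxVolumeDataSizeMB = 2000000

  [my_index]
  homePath   = volume:hotwarm/my_index/db
  coldPath   = volume:cold/my_index/colddb
  # thawedPath cannot reference a volume, so it stays a regular path
  thawedPath = $SPLUNK_DB/my_index/thaweddb

When a volume reaches its maxVolumeDataSizeMB, Splunk rolls its oldest buckets to the next tier, which generally gives the "oldest data moves first" behavior without forcing bucket boundaries by time.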

Further attempt at persuasion: It is generally okay to have more data than required. If only 6 months of data is needed but 9 months is available, Splunk will still return data quickly based on its underlying time series index. I say this to further discourage anyone from aging by time without careful consideration and planning.

If an index must be manipulated to discard data by age to meet a requirement, the maxHotIdleSecs setting would be used. Let us say a business rule demands we drop data older than 2 months and maxHotBuckets is set to 1. First, set maxHotIdleSecs to one day. Important: in this example, make sure that no more than one bucket per day is created by size (maxDataSize), or less than 2 months of data will be retained. Next, set maxWarmDBCount to 59. This configuration creates a new hot bucket each day and keeps 59 warm buckets (each ostensibly holding a day's data), so Splunk will roll data older than 60 days to cold. Finally, set frozenTimePeriodInSecs to 60 days so that data rolled to cold is frozen. Note: these settings are described here in days, but they are expressed in seconds in the .conf file.
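
Putting that into indexes.conf terms, a minimal sketch for a hypothetical index named my_index (the index name and paths are placeholders; the seconds values correspond to the 1-day and 60-day figures above):

  [my_index]
  homePath   = $SPLUNK_DB/my_index/db
  coldPath   = $SPLUNK_DB/my_index/colddb
  thawedPath = $SPLUNK_DB/my_index/thaweddb

  # Only one hot bucket at a time; roll it to warm once it has been idle for a day
  maxHotBuckets          = 1
  maxHotIdleSecs         = 86400       # 1 day, in seconds
  # Keep the bucket size large enough that a full day's data does not roll early by size
  maxDataSize            = auto_high_volume
  # Keep 59 warm buckets (roughly 59 days of data) before rolling to cold
  maxWarmDBCount         = 59
  # Freeze (delete or archive) anything older than 60 days
  frozenTimePeriodInSecs = 5184000     # 60 days, in seconds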

NB: While creating a new index with this configuration is one thing, applying these settings to an existing index is something more serious. Please be careful if considering these settings for a production environment. Consulting Support before implementing an impactful configuration is strongly encouraged.

bizza
Path Finder

Thank you, bwooden. I'll plan the tiering based on the data.
