Knowledge Management

How to deal with the datamodel retention period when the summary range is not working

Path Finder

Hi Splunkers,

I have been using Splunk Enterprise Security. The Network_Traffic datamodel is accelerated in my environment, and the summary range has been set to "1 month". However, when I subtract earliest(_time) from latest(_time), I see more than one month of difference. Below is the query I used to check the summary range explicitly.
| tstats summariesonly=true earliest(_time) as etime latest(_time) as ltime from datamodel=Network_Traffic | convert ctime(*time)

I can see a summary range of more than two months even though it is set to 1 month.

Please help me set a strict retention policy to avoid disk space issues.

Big Thanks in advance

Path Finder

Thanks lguinn for your answer.

As far as I know, the datamodel_summary is stored as part of the index, and the summary range can be set using acceleration.earliest_time = x, so summaries are created only for period x even if the index retention period (frozenTimePeriodInSecs) is x + y.

Are the datamodel_summary and index retention periods dependent on each other? If yes, how do I set a strict policy so the datamodel only stores events for period x?
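For reference, the summary range mentioned here is configured in datamodels.conf on the search head. A sketch (the stanza name matches the datamodel in question, but the range value is only an example):

```
# datamodels.conf (search head) -- example values
[Network_Traffic]
acceleration = 1
# Build and retain acceleration summaries for the last month only:
acceleration.earliest_time = -1mon
```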


Here is the deal: the retention setting on the index applies to buckets, not to individual events. A bucket cannot be removed from disk until all the events within it have expired. So if you set retention to one month and a bucket is large enough to hold 3 months of data, then you will definitely have more than one month of data in your index.
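As a concrete sketch, the index-level retention setting lives in indexes.conf (the index name here is hypothetical):

```
# indexes.conf -- index name is an example
[netfw]
# Events older than 30 days become eligible for freezing...
frozenTimePeriodInSecs = 2592000
# ...but a bucket is frozen only once its *newest* event passes this age,
# so a bucket spanning 3 months keeps all 3 months on disk until then.
```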

You can only set strict retention rules in one of two ways: (1) 1 bucket = 1 hour of data, or (2) 1 bucket = 1 day of data.
If you must, you can do this, but it will tend to create many small buckets (unless your daily volume is very high for the affected indexes). Many small buckets will cause your searches to run more slowly. I would avoid using strict retention to address your problem.
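If you do go this route despite the small-bucket penalty, the time span a bucket may cover is capped with maxHotSpanSecs. A sketch, with an example index name:

```
# indexes.conf -- force each bucket to span at most one day (example index)
[netfw]
maxHotSpanSecs = 86400
frozenTimePeriodInSecs = 2592000
# With 1-day buckets, a whole bucket expires within ~1 day of the
# retention cutoff, giving near-strict retention at the cost of
# many more buckets.
```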

In order to get a balance between disk space management and search efficiency, you might want to set the bucket size for your indexes. Do this in addition to your retention setting. For each index, figure out how much disk space is consumed per day of data. Also consider a typical search range - you don't want to create too many buckets. In general, I personally try to follow these rules:
- size buckets to hold 24-48 hours of data, with the following exceptions:
- do not make buckets smaller than "auto," which is 750 MB
- do not make buckets larger than "auto_high_volume," which is 10 GB
Of course, sometimes other factors come into play, like how often you want to back up the environment. And sometimes I will go a bit smaller. But these are good general starting points. Also, the bucket size that you set for the index is the approximate maximum bucket size; buckets can be smaller for a variety of reasons.
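Bucket size is also set in indexes.conf via maxDataSize, which accepts the two presets or an explicit size in MB. A sketch (index name and volume figures are illustrative):

```
# indexes.conf -- size buckets explicitly (example values)
[netfw]
# At roughly 5-10 GB/day of indexed data, the large preset gives
# buckets holding about 1-2 days each:
maxDataSize = auto_high_volume
# Alternatively, an explicit cap in MB, e.g. maxDataSize = 2000
```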

If you use the dbinspect command, it will show you a lot of cool information about the buckets in your index.
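For example, a search along these lines (the index name is a placeholder) shows how much time each bucket spans and how big it is:

```
| dbinspect index=netfw
| eval daysSpanned = round((endEpoch - startEpoch) / 86400, 1)
| table bucketId, state, startEpoch, endEpoch, daysSpanned, sizeOnDiskMB, eventCount
| sort - daysSpanned
```

Buckets spanning many days are the ones that will keep old events on disk past your retention target.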

Finally, datamodel acceleration summaries also take space. So you might want to look at the space consumed by these. Changing the summarization options for the datamodel could lower the disk space (although this is usually less disk than the indexes themselves.)
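One way to inspect the space used by acceleration summaries is the summarization REST endpoint. The exact field names can vary by version, so treat this as a starting point rather than a definitive recipe:

```
| rest /services/admin/summarization by_tstats=t splunk_server=local
| eval sizeMB = round('summary.size' / 1048576, 1)
| table summary.id, sizeMB, summary.complete
```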



@lguinn does this apply to datamodel retention too?



When you set up data model acceleration, you choose the time range for acceleration. This time range defines the "retention" for the acceleration data. It is independent of the index retention settings. But of course, you can't retain the acceleration data longer than the index data!
