Data Retention Policy

btshivanand
Path Finder

Hi Splunkers,

We defined a 35-day retention for the real-time indexes in Splunk, but I see that retention is not being enforced strictly: some old events are not being deleted. Some indexes hold around 200 days of data, and for some indexes only January data is present. I cross-checked both event time and index time; the data really is old and has not been deleted by the retention policy. I have two questions:

1) How can I enforce a strict 35-day retention?

2) Any idea why the January data is still present and is not being deleted even though buckets are rolling? How can it be deleted by the retention policy?

Below are my settings for one index.

 

[rt]
# 80 GB a day / 14 days in warm / 35 day retention
homePath = volume:hot/rt/db
coldPath = volume:cold/rt/colddb
thawedPath = $SPLUNK_DB/rt/thaweddb
homePath.maxDataSizeMB = 500000
coldPath.maxDataSizeMB = 1000000
maxWarmDBCount = 300
frozenTimePeriodInSecs = 3024000
maxDataSize = auto_high_volume
maxTotalDataSizeMB = 150000 

 

Regards, Shivanand


gcusello
SplunkTrust

Hi @btshivanand,

In Splunk, events are stored in buckets, and retention is managed at the bucket level, not at the event level.

In other words, a whole bucket is deleted (or moved to a different folder) only when its latest event exceeds the retention period.

For this reason, you can have buckets containing events older than the retention period: as long as at least one event in the bucket is still within the retention period, the whole bucket is kept.
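To make the bucket-level rule concrete, here is a minimal Python sketch of the check described above. It is only an illustration, not Splunk's actual code: the `Bucket` class and the `is_eligible_to_freeze` function are hypothetical names, and the only value taken from the question is `frozenTimePeriodInSecs = 3024000` (35 days).

```python
from dataclasses import dataclass

DAY = 86400
# Value from the indexes.conf in the question: 35 days expressed in seconds.
FROZEN_TIME_PERIOD_SECS = 3024000


@dataclass
class Bucket:
    """Hypothetical stand-in for a bucket: only its event time range matters here."""
    earliest: int  # epoch time of the oldest event in the bucket
    latest: int    # epoch time of the newest event in the bucket


def is_eligible_to_freeze(bucket, now, frozen_secs=FROZEN_TIME_PERIOD_SECS):
    """A bucket qualifies only when its NEWEST event is past retention."""
    return now - bucket.latest > frozen_secs


now = 200 * DAY

# Events from day 0 to day 180: the newest event is only 20 days old,
# so the whole bucket stays, including events that are 200 days old.
wide = Bucket(earliest=0, latest=180 * DAY)

# Events from day 0 to day 30: the newest event is 170 days old.
narrow = Bucket(earliest=0, latest=30 * DAY)

print(is_eligible_to_freeze(wide, now))    # False
print(is_eligible_to_freeze(narrow, now))  # True
```

The `wide` bucket is the situation in the question: very old events survive because they share a bucket with a recent one.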

Ciao.

Giuseppe


btshivanand
Path Finder

Thanks for the reply.

I see only January events in one of the indexes, and no events from then until July. I understand that events are stored in buckets and are deleted once the buckets roll out, but this looks strange to me. How can I get rid of this data?


gcusello
SplunkTrust

Hi @btshivanand,

In my opinion, leave it alone and it will fix itself.
As I said, retention is managed at the bucket level.


If you really want to take action, you could:

  • reduce the retention of that Index,
  • restart Splunk,
  • when the January events are cleared, reset the retention to the correct value,
  • and restart Splunk again.

But this way there is a risk of losing some events.

Ciao.

Giuseppe


btshivanand
Path Finder

OK, better to leave it as it is. But do you have any suggestion for maintaining the retention on our other indexes? They are also holding more data than expected.

 


gcusello
SplunkTrust

Hi @btshivanand,

For the second question: if you have more data, you'll probably have fewer problems, because a higher data volume means each bucket covers a shorter time range (between its oldest and newest events), so this problem tends to disappear.
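A rough back-of-the-envelope sketch in Python shows why volume matters. It assumes a bucket rolls only when it reaches its size limit (about 10 GB for `maxDataSize = auto_high_volume` on a 64-bit system) and ignores every other roll trigger, such as restarts or hot-bucket time-span limits. The function name and the 0.05 GB/day figure are hypothetical; the 80 GB/day figure comes from the comment in the posted config.

```python
def approx_bucket_span_days(daily_volume_gb, bucket_size_gb=10.0):
    """Approximate days of data one bucket covers if it rolls on size alone."""
    return bucket_size_gb / daily_volume_gb


# High-volume index (80 GB/day, from the config comment): each bucket
# spans a few hours, so retention behaves almost per-event.
print(approx_bucket_span_days(80.0))  # 0.125 days, i.e. 3 hours

# Hypothetical low-volume index (0.05 GB/day): one bucket could span
# months, so old events linger until the whole bucket ages out.
print(approx_bucket_span_days(0.05))
```

Under these assumptions, indexes with little daily data are the ones most likely to keep events well past 35 days.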

Ciao.

Giuseppe
