Hi Splunkers,
We defined a 35-day retention for real-time indexes in Splunk. I see that retention is not being applied strictly: some of the old events are not getting deleted. We have around 200 days of data for some indexes. I also see that for some indexes only January data is present. I cross-checked both the event time and the index time; the data is old and has not been deleted by the retention policy. I have two concerns:
1) How can we maintain a strict 35-day retention?
2) Any idea why the January data is still present and is not getting deleted even though buckets are rolling? And how can it be deleted through retention?
Below are my settings for one index.
[rt]
# 80 GB a day / 14 days in warm / 35 day retention
homePath = volume:hot/rt/db
coldPath = volume:cold/rt/colddb
thawedPath = $SPLUNK_DB/rt/thaweddb
homePath.maxDataSizeMB = 500000
coldPath.maxDataSizeMB = 1000000
maxWarmDBCount = 300
frozenTimePeriodInSecs = 3024000
maxDataSize = auto_high_volume
maxTotalDataSizeMB = 150000
Regards,
Shivanand
Hi @btshivanand,
in Splunk, events are stored in buckets, and retention is managed at the bucket level, not at the event level.
In other words, a whole bucket is deleted (or moved to a different folder) only when its latest event exceeds the retention period.
For this reason, you have buckets containing events older than the retention period: the same bucket holds at least one event that is still within the retention period.
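The bucket-level rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Splunk's actual implementation: the `Bucket` class and the `is_frozen` function are hypothetical names, and the only value taken from this thread is `frozenTimePeriodInSecs = 3024000`.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
FROZEN_TIME_PERIOD_SECS = 3_024_000  # value from the [rt] stanza: 35 * 86400


@dataclass
class Bucket:
    """Hypothetical stand-in for an index bucket (not a Splunk API)."""
    earliest_event: int  # epoch seconds of the oldest event in the bucket
    latest_event: int    # epoch seconds of the newest event in the bucket


def is_frozen(bucket: Bucket, now: int, retention: int = FROZEN_TIME_PERIOD_SECS) -> bool:
    """A bucket qualifies for freezing only when its NEWEST event is past retention."""
    return now - bucket.latest_event > retention


now = 200 * SECONDS_PER_DAY  # arbitrary "current time" for the illustration

# Events span day 0 to day 170: the newest event is only 30 days old.
mixed = Bucket(earliest_event=0, latest_event=170 * SECONDS_PER_DAY)
# Events span day 0 to day 100: the newest event is 100 days old.
old = Bucket(earliest_event=0, latest_event=100 * SECONDS_PER_DAY)

retention_days = FROZEN_TIME_PERIOD_SECS // SECONDS_PER_DAY
mixed_frozen = is_frozen(mixed, now)
old_frozen = is_frozen(old, now)
print(retention_days, mixed_frozen, old_frozen)
```

In this sketch the `mixed` bucket still contains 200-day-old events, yet it is kept because its newest event is only 30 days old; the `old` bucket is removed as a whole.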
Ciao.
Giuseppe
Thanks for the reply.
I see only January events for one of the indexes, and no events after that until July. I understand that events are stored in buckets and are deleted once the buckets roll out. This is what looks strange to me. How can I get rid of this data?
Hi @btshivanand,
in my opinion, leave it alone and it will fix itself.
As I told you, retention management is done at bucket level.
If you really want to take action, you could:
But this way there is a risk of losing some events.
Ciao.
Giuseppe
OK, better to leave it as it is. But do you have any suggestions for maintaining the retention on the other indexes we have? They are also holding more data than expected.
Hi @btshivanand,
for the second question: if you have more data, you will probably have fewer problems, because a higher data volume means a narrower time range (between the oldest and the newest events) in each bucket, so this problem should disappear.
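A rough back-of-the-envelope calculation illustrates why volume matters. The sketch below assumes a bucket rolls only when it reaches its maximum size, and it uses an illustrative 10 GB bucket size (an assumption; check what `maxDataSize = auto_high_volume` resolves to on your system). It ignores compression and any other roll triggers. The 80 GB/day figure comes from the comment in the `[rt]` stanza; the 0.05 GB/day index and the function name `bucket_span_days` are hypothetical.

```python
def bucket_span_days(bucket_size_gb: float, daily_volume_gb: float) -> float:
    """Approximate number of days of data one bucket covers before it fills."""
    if daily_volume_gb <= 0:
        raise ValueError("daily volume must be positive")
    return bucket_size_gb / daily_volume_gb


BUCKET_SIZE_GB = 10.0  # assumed bucket size, for illustration only

busy_span = bucket_span_days(BUCKET_SIZE_GB, 80.0)   # 80 GB/day, as in the [rt] comment
quiet_span = bucket_span_days(BUCKET_SIZE_GB, 0.05)  # hypothetical low-volume index

print(f"busy index:  ~{busy_span * 24:.0f} hours per bucket")
print(f"quiet index: ~{quiet_span:.0f} days per bucket")
```

Under these assumed numbers, a busy index fills a bucket within hours, so its buckets age out close to the 35-day mark, while a quiet index can keep months of events in a single bucket that is retained until its newest event expires.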
ciao.
Giuseppe