Getting Data In

Programmatically detect the age of the oldest record to be sure Splunk isn't discarding old data too soon?


I'm looking for suggestions on the best way to programmatically check the age of the oldest record in an index. If I can figure out a light-weight way to do this, then I'd wrap it into a Nagios check and run that check a few times per day to validate that I'm not deleting data too quickly within Splunk. Any suggestions?

I'm looking to do this because despite my settings, Splunk's sometimes unexpectedly deleting data too quickly -- I think I have everything configured to use up to 1.6 TB of storage (and I'm using less than half that) and keep data for 18 months (frozenTimePeriodInSecs = 48211200), but in practice sometimes Splunk is nuking data at three months. Worse, I have to keep manually checking a search of old data just to detect that the problem is happening.

If I can automate the checking, then I can spot this more quickly and hopefully limit the damage. Any thoughts?



Before answering your question: you may want to revisit how you configured your indexes. Data rolls to frozen by both size and time. So yes, frozenTimePeriodInSecs will roll a bucket to frozen when it reaches that age, but if your index grows larger than the maximum size for the index, the oldest bucket will also roll to frozen. So if buckets are rolling too soon, the index may not be sized appropriately, and you may just need to adjust maxTotalDataSizeMB for that index.
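The two triggers described above can be sketched as a small Python function. This is a simplified illustration, not Splunk's actual implementation: the function name and arguments are hypothetical, and it ignores other limits (such as volume-level size caps) that can also cause buckets to roll.

```python
SECONDS_PER_DAY = 86400


def freeze_reasons(newest_event_age_secs, index_size_mb,
                   frozen_time_period_secs, max_total_data_size_mb):
    """Return which limits would cause the oldest bucket to roll to frozen.

    Hypothetical helper for illustration only. The arguments mirror the
    indexes.conf settings frozenTimePeriodInSecs and maxTotalDataSizeMB.
    """
    reasons = []
    # Time-based: the newest event in the bucket is older than the retention period.
    if newest_event_age_secs > frozen_time_period_secs:
        reasons.append("time")
    # Size-based: the index as a whole has outgrown its size limit.
    if index_size_mb > max_total_data_size_mb:
        reasons.append("size")
    return reasons


# The retention period from the question, expressed in days.
retention_days = 48211200 / SECONDS_PER_DAY
```

For example, a bucket whose newest event is 90 days old is far inside a 558-day retention period, but it would still be frozen if the index had grown past its size limit, which matches the "data gone at three months" symptom in the question.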

That said, you should be able to use the REST API to get the earliest event in an index:

standalone: services/data/indexes/
cluster master (possibly this endpoint, or possibly in the bucket data): services/cluster/master/indexes/
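As a rough sketch of how the Nagios check from the question could use the standalone endpoint, the Python below parses a response body and maps the age of the oldest event to a Nagios exit code. Several things here are assumptions rather than facts from this thread: that the response was requested as JSON, that each entry carries a minTime field under content, and that the timestamp looks like 2023-01-01T00:00:00+0000. Fetching the response (host, port, credentials) is deliberately left out, and the function names and thresholds are hypothetical.

```python
import json
from datetime import datetime, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_index(payload_text, index_name, warn_days, crit_days, now=None):
    """Return (exit_code, message) for the oldest event in one index.

    Alerts when the oldest event is YOUNGER than expected, which suggests
    that older data has already been rolled to frozen.
    """
    now = now or datetime.now(timezone.utc)
    payload = json.loads(payload_text)
    for entry in payload.get("entry", []):
        if entry.get("name") != index_name:
            continue
        # Assumed field name and timestamp format; adjust to the real response.
        min_time = entry.get("content", {}).get("minTime")
        if not min_time:
            return UNKNOWN, f"UNKNOWN - no minTime reported for {index_name}"
        try:
            oldest = datetime.strptime(min_time, "%Y-%m-%dT%H:%M:%S%z")
        except ValueError:
            return UNKNOWN, f"UNKNOWN - cannot parse minTime {min_time!r}"
        age_days = (now - oldest).total_seconds() / 86400
        detail = f"oldest event in {index_name} is {age_days:.0f} days old"
        if age_days < crit_days:
            return CRITICAL, f"CRITICAL - {detail}"
        if age_days < warn_days:
            return WARNING, f"WARNING - {detail}"
        return OK, f"OK - {detail}"
    return UNKNOWN, f"UNKNOWN - index {index_name} not found in response"
```

A wrapper script would print the message and exit with the returned code. Note that the thresholds only make sense for an index that has been collecting data for longer than the threshold itself; a newly created index would otherwise alert until it has aged.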


Alternatively, use | dbinspect, or something like this: | tstats min(_time) as first_event where index=* by index
If you have more than one indexer, add splunk_server to the by clause to check that the indexers are aligned.
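To illustrate the per-indexer comparison, here is a minimal Python sketch that takes the rows of a tstats search split by index and splunk_server and reports how far apart the indexers' earliest events are. The row shape (dictionaries with index, splunk_server, and first_event as an epoch value) is an assumption about how the results were exported, and the function name is hypothetical.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


def indexer_spread_days(rows):
    """Return, per index, the gap in days between the earliest and latest
    first_event values reported by the different indexers.

    A large gap suggests that one indexer is freezing data sooner than the others.
    """
    per_index = defaultdict(list)
    for row in rows:
        # first_event is assumed to be epoch seconds, possibly as a string.
        per_index[row["index"]].append(float(row["first_event"]))
    return {
        name: (max(times) - min(times)) / SECONDS_PER_DAY
        for name, times in per_index.items()
    }
```

A monitoring check could then warn when the spread for any index exceeds a chosen number of days.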
