I'm looking for suggestions on the best way to programmatically check the age of the oldest record in an index. If I can find a lightweight way to do this, I'd wrap it in a Nagios check and run it a few times per day to verify that I'm not deleting data too quickly within Splunk. Any suggestions?
I'm looking to do this because, despite my settings, Splunk sometimes deletes data sooner than expected. I think I have everything configured to use up to 1.6 TB of storage (and I'm using less than half that) and to keep data for 18 months (frozenTimePeriodInSecs = 48211200), but in practice Splunk sometimes nukes data at three months. Worse, I have to keep manually running a search over old data just to detect that the problem is happening.
If I can automate the checking, then I can spot this more quickly and hopefully limit the damage. Any thoughts?
Not to answer your question first, but you may want to revisit how you configured your indexes. Data rolls to frozen by both size and time. So yes, frozenTimePeriodInSecs will roll a bucket once it hits that age, but if your index grows larger than its maximum size, the oldest bucket will also roll to frozen. So you may just need to adjust maxTotalDataSizeMB for an index if buckets are rolling too soon, because maybe the index isn't sized appropriately?
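For example, a rough indexes.conf stanza along those lines (the index name, paths, and sizes are placeholders, not recommendations):

    [my_index]
    homePath   = $SPLUNK_DB/my_index/db
    coldPath   = $SPLUNK_DB/my_index/colddb
    thawedPath = $SPLUNK_DB/my_index/thaweddb
    # 18 months in seconds -- buckets older than this roll to frozen
    frozenTimePeriodInSecs = 48211200
    # total size cap for the index -- once the index exceeds this,
    # the oldest bucket rolls to frozen regardless of its age
    maxTotalDataSizeMB = 1600000

Whichever limit is hit first wins, so an undersized maxTotalDataSizeMB will freeze data long before frozenTimePeriodInSecs ever applies.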
That said, you should be able to use the REST API to get the earliest event in an index:
standalone: services/data/indexes/
cluster master (maybe this, or maybe it's in the bucket data): services/cluster/master/indexes/
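On a standalone instance, the per-index entries returned by that endpoint should include a minTime field with the earliest event time in the index, so a rough check could look like this (host, port, credentials, and index name are placeholders):

    # query one index's entry and pull out its earliest event time
    curl -sk -u admin:changeme \
        "https://localhost:8089/services/data/indexes/main?output_mode=json" \
      | grep -o '"minTime":"[^"]*"'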
Or use | dbinspect, which reports per-bucket time ranges.
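A minimal sketch, assuming the dbinspect fields startEpoch/endEpoch carry each bucket's earliest/latest event times (the index name is a placeholder):

    | dbinspect index=main
    | stats min(startEpoch) as oldest_event_epoch
    | eval oldest_event = strftime(oldest_event_epoch, "%F %T")
    | eval age_days = round((now() - oldest_event_epoch) / 86400, 1)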
Or maybe something like this: | tstats min(_time) as first_event where index=* by index
If you have more than one indexer, add splunk_server to the by clause to verify that the indexers are aligned.
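Putting that together, a sketch of the per-indexer version, with the epoch converted to something readable and an age in days:

    | tstats min(_time) as first_event where index=* by index splunk_server
    | eval age_days = round((now() - first_event) / 86400, 1)
    | eval first_event = strftime(first_event, "%F %T")
    | sort index splunk_server

That could feed your Nagios check: alert whenever age_days for a given index drops below the retention you expect.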