Getting Data In

Programmatically detect the age of the oldest record to be sure Splunk isn't discarding old data too soon?

tgfurnish
Engager

I'm looking for suggestions on the best way to programmatically check the age of the oldest record in an index. If I can figure out a light-weight way to do this, then I'd wrap it into a Nagios check and run that check a few times per day to validate that I'm not deleting data too quickly within Splunk. Any suggestions?

I'm looking to do this because despite my settings, Splunk's sometimes unexpectedly deleting data too quickly -- I think I have everything configured to use up to 1.6 TB of storage (and I'm using less than half that) and keep data for 18 months (frozenTimePeriodInSecs = 48211200), but in practice sometimes Splunk is nuking data at three months. Worse, I have to keep manually checking a search of old data just to detect that the problem is happening.

If I can automate the checking, then I can spot this more quickly and hopefully limit the damage. Any thoughts?


maciep
Champion

Not to answer your question first, but you may want to revisit how your indexes are configured. Buckets roll to frozen based on both size and time. So yes, frozenTimePeriodInSecs freezes a bucket once it reaches that age, but if the index grows larger than its maximum size, the oldest bucket is rolled to frozen as well. If buckets are rolling too soon, you may just need to adjust maxTotalDataSizeMB for the index, because the index may not be sized appropriately.
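To make the "size or time, whichever comes first" point concrete, here is a minimal sketch of that rule in Python. It is a simplified model, not Splunk's actual logic: the function name and the sample numbers are hypothetical, and it ignores other limits (such as volume-level caps) that can also force buckets out.

```python
SECONDS_PER_DAY = 86_400


def freeze_reason(newest_event_age_secs, index_size_mb,
                  frozen_time_period_in_secs, max_total_data_size_mb):
    """Return why the oldest bucket would roll to frozen, or None.

    Simplified model (an assumption, not Splunk's code): a bucket is
    frozen when its newest event is older than frozenTimePeriodInSecs,
    or when the index as a whole exceeds maxTotalDataSizeMB.
    """
    if newest_event_age_secs > frozen_time_period_in_secs:
        return "age"
    if index_size_mb > max_total_data_size_mb:
        return "size"
    return None


# The setting from the question: 48211200 seconds is 558 days (about 18 months).
retention_days = 48_211_200 / SECONDS_PER_DAY

# Hypothetical case: a 90-day-old bucket in an index that has outgrown its cap.
reason = freeze_reason(
    newest_event_age_secs=90 * SECONDS_PER_DAY,
    index_size_mb=600_000,           # hypothetical current index size
    frozen_time_period_in_secs=48_211_200,
    max_total_data_size_mb=500_000,  # hypothetical per-index cap
)
print(retention_days, reason)
```

In this model the bucket is frozen for "size" long before the 558-day age limit matters, which matches the symptom of data disappearing at around three months.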

That said, you should be able to use the REST API to get the earliest event in an index:

standalone: services/data/indexes/
cluster master (maybe this endpoint, or maybe it's in the bucket data): services/cluster/master/indexes/
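A rough sketch of how the output of the standalone endpoint could feed a Nagios check. It assumes the response was fetched with output_mode=json, that each entry carries a minTime field under content, and that minTime is an ISO-8601 string with a numeric UTC offset; verify all of that against your own Splunk version. The function names, thresholds, and sample payload are made up for illustration, and fetching the payload (authentication, HTTPS) is left out.

```python
import json
from datetime import datetime, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def oldest_event_age_days(payload, index_name, now):
    """Return the age in days of the earliest event for index_name, or None."""
    for entry in json.loads(payload).get("entry", []):
        if entry.get("name") != index_name:
            continue
        min_time = entry.get("content", {}).get("minTime")
        if not min_time:
            return None  # index is empty or the field is missing
        # Assumed format, e.g. "2023-01-01T00:00:00+0000".
        earliest = datetime.strptime(min_time, "%Y-%m-%dT%H:%M:%S%z")
        return (now - earliest).total_seconds() / 86_400
    return None


def nagios_status(age_days, warn_below_days, crit_below_days):
    """Map the oldest-event age onto a Nagios exit code."""
    if age_days is None:
        return UNKNOWN
    if age_days < crit_below_days:
        return CRITICAL
    if age_days < warn_below_days:
        return WARNING
    return OK


# Hypothetical payload standing in for a services/data/indexes response.
sample = '{"entry": [{"name": "main", "content": {"minTime": "2023-01-01T00:00:00+0000"}}]}'
now = datetime(2023, 4, 11, tzinfo=timezone.utc)

age = oldest_event_age_days(sample, "main", now)
status = nagios_status(age, warn_below_days=365, crit_below_days=120)
print(age, status)
```

A real plugin would finish with sys.exit(status). Thresholds like these only make sense once the index has existed longer than the retention you expect; a brand-new index would otherwise alert immediately.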

adonio
Ultra Champion

Or use | dbinspect, or maybe something like this: | tstats min(_time) as first_event where index=* by index
If you have more than one indexer, add splunk_server to the by clause to verify that the indexers are aligned.
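If the search is run as | tstats min(_time) as first_event where index=* by index, splunk_server, the alignment check could be scripted roughly as below. This is only a sketch: it assumes the results have already been exported as rows with index, splunk_server, and first_event (epoch seconds) fields, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def first_event_spread_days(rows):
    """Per index, return the gap in days between the indexers with the
    oldest and the newest first_event."""
    per_index = defaultdict(list)
    for row in rows:
        per_index[row["index"]].append(float(row["first_event"]))
    return {name: (max(times) - min(times)) / 86_400
            for name, times in per_index.items()}


# Hypothetical rows: idx2 has lost 90 more days of "main" than idx1.
rows = [
    {"index": "main", "splunk_server": "idx1", "first_event": 1_600_000_000},
    {"index": "main", "splunk_server": "idx2", "first_event": 1_600_000_000 + 90 * 86_400},
    {"index": "web", "splunk_server": "idx1", "first_event": 1_650_000_000},
]

spread = first_event_spread_days(rows)
print(spread)
```

A large spread for one index suggests that one indexer is freezing data earlier than the others, for example because its copy of the index reaches the size limit sooner.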
