I want to detect when the rate at which an index's _indextime advances deviates from its usual rate by more than a set amount, with a tolerance that grows the faster the index normally updates.
For example:
1. Index A - It indexes once every 6 hours and populates the past 6 hours of events. In this circumstance I would want to know if it hasn't indexed for 8 hours or more. The tolerance is therefore relatively small (around 30% extra).
2. Index B - It indexes every second. In this circumstance I may forgive it not indexing for a few seconds, but I'd definitely want to know if it hasn't indexed in 10 minutes. The tolerance is therefore relatively large.
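One way to express both examples with a single rule is a proportional margin combined with an absolute floor. The sketch below is only an illustration: the 1.33 multiplier, the 600 second floor, and the field names typical_gap_s and allowed_gap_s are assumptions chosen to reproduce the two cases above, not anything Splunk provides.

```spl
| makeresults
| eval typical_gap_s=split("1,21600", ",")
| mvexpand typical_gap_s
| eval typical_gap_s=tonumber(typical_gap_s)
```Assumed rule: allow 33% over the usual gap, but never flag anything before 600 seconds have passed```
| eval allowed_gap_s=max(typical_gap_s*1.33, 600)
| eval allowed_gap_h=round(allowed_gap_s/3600, 2)
| table typical_gap_s allowed_gap_s allowed_gap_h
```

With these assumed values, the index that updates every second is allowed 10 minutes of silence, and the index that updates every 6 hours is allowed roughly 8 hours.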
I don't think _time is the right field to use: an index that backfills retrospectively populates past _time values, so I expect it would give false results.
I feel that either the _internal index or tstats has the answer, but I've not yet come close.
For anyone else: the search below eventually worked the way I wanted, although there may be a more efficient way to do the same thing!
| tstats max(_indextime) as indextime WHERE earliest=-7d latest=now() index=* BY sourcetype index _time span=1h
```Look back over a 7 day window, and get the typical number of hours between indextimes, as well as the number of hours seen```
| sort 0 + index sourcetype indextime
| streamstats window=2 range(indextime) as range_indextime by sourcetype index
| eval range_indextime=range_indextime/60/60
| stats max(indextime) as last_indextime dc(indextime) as hours_seen_over_7_days avg(range_indextime) as range_based_spacing by sourcetype index
| eval now=now()
```168 is the number of hours in the 7 day window; divide by the number of hourly buckets in which data was seen```
| eval average_hour_spacing=168/hours_seen_over_7_days
| eval hours_since_last_seen=abs((now-last_indextime)/60/60)
```Compare the time since we last saw indexes, and determine if it is likely late or not.```
| eval is_late=case(
    ((range_based_spacing<=1 AND hours_since_last_seen>=1.5 AND average_hour_spacing<=1)
      OR (range_based_spacing<=6 AND hours_since_last_seen>=8 AND average_hour_spacing<=6)
      OR (range_based_spacing<=12 AND hours_since_last_seen>=15 AND average_hour_spacing<=12)
      OR (range_based_spacing<=24 AND hours_since_last_seen>=36)
      OR isnull(last_indextime))
      AND hours_seen_over_7_days>1, "yes",
    (hours_since_last_seen>24 AND hours_seen_over_7_days<=1), "maybe",
    1=1, "no")
| eval last_indextime=strftime(last_indextime,"%Y-%m-%dT%H:%M")
| fields - now
This is what I have so far, but it seems way too complex! A baseline inner search works out the average rate over -48h to -24h, and that is then joined to the same search run over -24h to now.
| tstats count WHERE earliest=-24h latest=now() index=* BY index sourcetype _indextime
| top limit=5 _indextime by index sourcetype
| streamstats range(_indextime) as range_indextime by sourcetype index
| stats avg(range_indextime) as observed_avg_range_indextime by index sourcetype
| join type=inner index sourcetype
[| tstats count WHERE earliest=-48h latest=-24h index=* BY index sourcetype _indextime
| top limit=5 _indextime by index sourcetype
| streamstats range(_indextime) as range_indextime by sourcetype index
| stats avg(range_indextime) as avg_range_indextime by index sourcetype]
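A possible way to avoid the join is to run a single tstats over the whole 48 hours and label each row with the period it belongs to. This is an untested sketch rather than a drop-in replacement: it takes max(_indextime) per hourly _time bucket instead of using top, and the period and gap_s field names are made up for illustration.

```spl
| tstats max(_indextime) as indextime where index=* earliest=-48h latest=now by index sourcetype _time span=1h
```Assumption: label each hourly bucket as baseline (-48h to -24h) or observed (-24h to now)```
| eval period=if(_time>=relative_time(now(), "-24h"), "observed", "baseline")
| sort 0 index sourcetype period indextime
```global=false keeps a separate 2-row window for each index/sourcetype/period group```
| streamstats global=false window=2 range(indextime) as gap_s by index sourcetype period
| stats avg(eval(if(period="baseline", gap_s, null()))) as avg_range_indextime avg(eval(if(period="observed", gap_s, null()))) as observed_avg_range_indextime by index sourcetype
```

Unlike the inner join, this keeps index and sourcetype pairs that appear in only one of the two periods; the column for the missing period is simply empty. The first row of each group has a gap of 0, because range() over a single value is 0, which pulls the averages down slightly.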
You can do your calculations based on _indextime, but you still have to select your data with _time. There is no other way in Splunk, since _time is the primary "ordering field".
So you can do something like
index=whatever earliest=-8h
| stats max(_indextime)
to find out when the latest indexed event was indexed. You just need to get the initial timerange with a sufficient margin.
If I remember correctly, _indextime can be used with tstats as well (just not as a field with which you can bin with a given span).
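As a rough sketch of the tstats variant, the search above could look something like the following. The index=* filter, the 8 hour window, and the output field names are assumptions to adjust; the window still needs enough margin to cover the slowest index.

```spl
| tstats max(_indextime) as last_indextime where index=* earliest=-8h by index
```Hours elapsed between now and the most recent index time found in the window```
| eval hours_since_last_indexed=round((now()-last_indextime)/3600, 2)
| eval last_indextime=strftime(last_indextime, "%Y-%m-%dT%H:%M:%S")
```

Note that an index with no events inside the selected _time range does not appear in the results at all, so a missing row is itself a sign that nothing arrived in that window.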