So, I have data like this after I ran a query.
For each aggregator, if the aggregator_status is Error and it changes to Up within 15 minutes, the alert should not fire. But if the aggregator_status is still Error, or no new event arrives, the alert should trigger. The Time field is epoch time, which I think can be used to find the difference between the Up and Error status times.
How do I create such a query for the alert? I am thinking of using the foreach command or some sort of streamstats, but I have not been able to get it working. The alert needs to run once every 24 hours.
It might be doable with the transaction command but it's usually not a good idea (transaction is a relatively "heavy" command and has its limitations).
I'd go with streamstats and reset_before, reset_after and time_window options. (can't give you a ready-made answer at the moment since I'm away from my Splunk environment but that's the way I'd try)
I tried something like this
index=abc ("Aggregator * is Error" OR "Aggregator * is Up") NJ12GC102
| rex field=_raw "Aggregator\s(?<aggregator>[^\s]+)\sis\s(?<aggregator_status>\w+)\s"
| streamstats current=t global=f window=2 range(_time) as time_diff by aggregator,aggregator_status
| streamstats current=t global=f window=2 range(_time) as time_diff2 by aggregator
| table _time aggregator aggregator_status time_diff time_diff2
But the output is not what I needed. For that I would need to change window=2, but that brings more issues.
Try starting with something like this
| streamstats values(aggregator_status) as previous_aggregator_status by aggregator window=1 current=f global=f
| eval changetime=if((aggregator_status="Up" and previous_aggregator_status="Error") or (aggregator_status="Error" and previous_aggregator_status="Up"),_time,null())
| where isnotnull(changetime)
| streamstats current=t global=f window=2 range(_time) as time_diff2 by aggregator
| where aggregator_status="Error"
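To then apply the 15-minute rule from the original question, one possible approach is to look up, for every Error event, the time of the nearest later Up event for the same aggregator and compare the gap to 900 seconds. This is only a sketch and has not been run against the real data: the base search and rex are copied from the earlier post, the field names up_time, next_up_time and secs_to_recover are made up for illustration, and it assumes an Error with no later Up should be measured against now().

```spl
index=abc ("Aggregator * is Error" OR "Aggregator * is Up") NJ12GC102
| rex field=_raw "Aggregator\s(?<aggregator>[^\s]+)\sis\s(?<aggregator_status>\w+)\s"
``` remember the timestamp only on Up events ```
| eval up_time=if(aggregator_status="Up", _time, null())
``` newest first, so rows already seen by streamstats are later in time ```
| sort 0 - _time
``` smallest later Up timestamp = the first Up after this event ```
| streamstats current=f min(up_time) as next_up_time by aggregator
| where aggregator_status="Error"
``` if no Up followed, measure against the current time (assumption) ```
| eval secs_to_recover=coalesce(next_up_time, now()) - _time
| where secs_to_recover > 900
| table _time aggregator aggregator_status next_up_time secs_to_recover
```

With this shape the alert condition would simply be "number of results greater than 0", scheduled once every 24 hours as asked. Note that an Error that is less than 15 minutes old when the search runs, and has no Up yet, is not reported by this version.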