I was wondering if it's possible to set up an alert along these lines: if there is an "errorcode=800" spike over X threshold for X minutes, trip the alert.
Like, if it’s a prolonged spike that doesn’t go away or climbs in volume/frequency. Thought I’d ask if this type of alert is even possible…?
By a spike I mean more than 30 errors, with the count continuing to increase for the next 5 minutes or longer.
Create a report that finds what you are looking for (assuming it has already happened), then save it as an alert.
In your report, you might count the number of errors in 1-minute buckets over the last 5 minutes, then keep only the buckets whose count exceeds the error threshold (x). Then count how many of those 1-minute buckets are left, and trigger the alert if that number is over a second threshold (y).
index = blah ...
| bin _time span=1m
| stats count(eval(error="800")) as errors by _time
| where errors > x
| stats count as minutes
| where minutes > y
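
To make this concrete, here is a rough sketch of the same idea with the numbers from the question filled in (more than 30 errors per minute, sustained for 5 minutes), plus a check that the count keeps rising. It is only one way to do it and rests on a few assumptions: the index name is still the placeholder from above, the field is taken to be errorcode as in the question (swap in error if that is what your events carry), and prev_errors, over_threshold, rising, minutes_over and minutes_rising are just names made up for this example. The search looks at the last 5 complete minutes, compares each minute to the one before it, and only returns a row if all 5 minutes are over 30 and each minute after the first is higher than the previous one.

```
index=blah errorcode=800 earliest=-5m@m latest=@m
| timechart span=1m count as errors
| streamstats current=f window=1 last(errors) as prev_errors
| eval over_threshold=if(errors > 30, 1, 0)
| eval rising=if(isnotnull(prev_errors) AND errors > prev_errors, 1, 0)
| stats sum(over_threshold) as minutes_over, sum(rising) as minutes_rising
| where minutes_over >= 5 AND minutes_rising >= 4
```

Saved as an alert scheduled every minute with the trigger condition "number of results is greater than 0", this should fire only while the spike is both sustained and still growing. If a prolonged plateau should also trip the alert, drop the minutes_rising condition or loosen it.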