We want to configure an alert that triggers when there are continuous errors for more than 5 minutes per app server per host. By continuous we mean that every minute in those 5 minutes has at least one error. How can I check that every one-minute interval in those 5 minutes had an error and then trigger the alert?
You can try something like this (it's untested):
index=... log_level=ERROR
| bin _time span=1m
| stats count by _time
| where count>0
| makecontinuous count
It uses 5 spans of 1 minute each. It then checks whether each span has a count value, using makecontinuous to see if there are 5 in a row.
@skoelpin, where in the query are we specifying that it should be non-zero for 5 continuous bins?
Correct, this is why I added | where count>0. I haven't tested this, but this will definitely get you started.
This count is the number of errors per minute, right? How do I check that all of the last 5 spans were > 0?
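One way to check all 5 spans, as an untested sketch: restrict the search to the last 5 complete minutes, bucket events into 1-minute bins, and count how many distinct bins contain errors for each host. If that number is 5, every minute had at least one error. The earliest/latest window and the field name minutes_with_errors are assumptions of mine, and host stands in for whatever fields identify your app server and host.

```
index=... log_level=ERROR earliest=-5m@m latest=@m
| bin _time span=1m
| stats count by _time, host
| stats dc(_time) as minutes_with_errors by host
| where minutes_with_errors=5
```

Saved as an alert that runs every minute and triggers when the number of results is greater than 0, this should return one row per host that had errors in each of the 5 minutes. A minute with no errors produces no row from the first stats, so a separate where count>0 is not needed.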