Alert if Value over threshold for a certain period of time

omprakash9998
Path Finder

Hi,

I have an event that is received once every 2 minutes. I am trying to set up an alert that fires if the Value for the event stays above a certain threshold for 15 minutes or more. I am using the query below.

index= x host = y 
|Where Value > Threshold
|sort _time
|bin _time span = 16m
| stats count by host _time
|Where count > 6
|Eval count = count *2

Does the above query need any changes to work?
Thanks in advance


aberkow
Builder

Some minor edits:

index=x host=y Value > Threshold # moved Value > Threshold up into the base search; you probably also want to filter down to a very specific set of logs
| sort _time # why do you need the sort? Events are already returned in descending _time order by default
| bin _time span=16m
| stats count by host, _time # added a comma for readability
| where count > 7 # shouldn't this be 7? You'd want all eight 2-minute samples in the bin to be above the threshold
| eval count = count * 2 # why do you need this line?

There are other ways to do this, for example taking the earliest and latest times at which the value exceeded the threshold and computing the difference. I would also urge you to get comfortable testing your alerts. In this case, lower the threshold and check whether, for example, a threshold of 0 returns the complete result set for all the hosts you would expect to see.
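A rough sketch of the earliest/latest approach, not a tested alert. It assumes the search only looks at the last 16 minutes, that Threshold stands for a literal number (100 below is a made-up value), and that events really do arrive every 2 minutes. The names first_over, last_over, events_over and minutes_over are just illustrative:

```
index=x host=y earliest=-16m
| where Value > 100
| stats min(_time) AS first_over, max(_time) AS last_over, count AS events_over BY host
| eval minutes_over = round((last_over - first_over) / 60)
| where events_over >= 8 AND minutes_over >= 14
| eval first_over = strftime(first_over, "%Y-%m-%d %H:%M:%S"), last_over = strftime(last_over, "%Y-%m-%d %H:%M:%S")
```

Eight samples taken 2 minutes apart span 14 minutes from first to last, which is why the difference is compared against 14 rather than 16. The count check is there because the time difference alone would also pass if the value dipped below the threshold somewhere in the middle of the window.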

Hope this helps!

omprakash9998
Path Finder
| sort _time # this makes the logs easier for the application team to read when they open the alert, since all the events are in ascending order
| bin _time span=16m
| stats count by host, _time
| where count > 7 # yes, it should be 7
| eval count = count * 2 # only to display the number of minutes the value was above the threshold
| rename count AS "Minutes Over Threshold", host AS Host

omprakash9998
Path Finder

Thank you for the help.
How would I go about grabbing the earliest and latest times at which the Value exceeded the threshold and taking the difference?
Thank you.
