How to create an alert that detects anomalies or outliers in Splunk?

shashank_24 · ‎05-10-2022

Hi, I have created a few alerts that look at the failure rates of my services, with a condition that triggers the alert if the failure rate is > 10% AND the number of failed requests is > 200.

This is not really an ideal way to do monitoring. Is there a way in Splunk to use AI to detect anomalies or outliers over time? Basically, if Splunk detects a failure pattern and that pattern stays consistent, it should not trigger an alert; only if failures go beyond the threshold should it trigger.

Can we do this kind of thing in Splunk using built-in ML or AI?

bowesmana · ‎05-10-2022

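Take a look at the Machine Learning Toolkit (MLTK); there are some good examples on outliers there. You can also roll your own, e.g. this type of search counts matching events per minute and flags any minute whose count falls outside 3 * stdev of the average over the last 60 one-minute buckets (roughly a rolling hour):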

search error
| bin _time span=1m
| stats count by _time
| streamstats window=60 avg(count) as avg stdev(count) as stdev
| eval multiplier = 3
| eval lower_bound = avg - (stdev * multiplier)
| eval upper_bound = avg + (stdev * multiplier)
| eval outlier = if(count < lower_bound OR count > upper_bound, 1, 0)
| table _time count lower_bound upper_bound outlier
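
To turn a search like the one above into an alert, one option is to keep only the rows that breach the upper bound and have the alert trigger when the number of results is greater than 0. The following is a rough sketch, not a tested production search. It assumes that only spikes matter (so the lower bound is dropped), that the current minute should be left out of its own baseline (`current=f`), and that the 200-failure floor from the question applies per minute. The field name `min_count` is made up for illustration; adjust it to however your failures are actually counted.

```
search error
| bin _time span=1m
| stats count by _time
| streamstats current=f window=60 avg(count) as avg stdev(count) as stdev
| eval multiplier = 3
| eval min_count = 200
| eval upper_bound = avg + (stdev * multiplier)
| where count > upper_bound AND count > min_count
| table _time count avg upper_bound
```

With this shape, a failure rate that is steadily high simply raises the rolling average and stays inside the bound, so nothing is returned. Only a minute that jumps well above the recent pattern comes back as a result row. Keeping an absolute floor alongside the statistical bound helps avoid alerts on quiet services where a handful of errors is already 3 * stdev above normal.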
