
How to search a moving average of Errors for a 2 hour time period and alert if the number of errors goes 10% higher than the average?

SplunkTrust

I have an index that receives around 600,000 events per day. Each day between 12am and 2am, we get a lot of errors due to maintenance on the website. I want to find the average number of errors per day in the 12am-2am window and create an alert that fires if at any time the number of errors in that window goes more than 10% above the average.

Here's a similar thread I posted about 2 weeks ago. This question is slightly different: I don't want the average errors for the whole day, only the average errors from 12am-2am.

http://answers.splunk.com/answers/294662/how-to-set-an-alert-on-a-moving-average.html

Here's my current search:

index=vertex7-access RTG_Error="500" | bucket _time span=1d  | timechart count | where count >  1.1 * [ search index=vertex7-access RTG_Error="500" earliest=-7d@d latest=@d | bucket _time span=1d | stats count by _time | stats avg(count) as AvgDailyError500Count | return $AvgDailyError500Count ]

Re: How to search a moving average of Errors for a 2 hour time period and alert if the number of errors goes 10% higher than the average?

Esteemed Legend

Like this:

index=vertex7-access RTG_Error="500" | eval date_hour = strftime(_time, "%H") | where date_hour>=0 AND date_hour<2 | bucket _time span=1d  | timechart count | where count >  1.1 * [ search index=vertex7-access RTG_Error="500" earliest=-7d@d latest=@d | eval date_hour = strftime(_time, "%H") | where date_hour>=0 AND date_hour<2 | bucket _time span=1d | stats count by _time | stats avg(count) as AvgDailyError500Count | return $AvgDailyError500Count ]


Re: How to search a moving average of Errors for a 2 hour time period and alert if the number of errors goes 10% higher than the average?

SplunkTrust

Just an alternative approach with a single search (it may run faster):

index=vertex7-access RTG_Error="500" earliest=-7d@d latest=now | eval date_hour=strftime(_time, "%H") | where date_hour>=0 AND date_hour<2 | timechart span=1d count | eval Day=if(_time<relative_time(now(),"@d"),"PreviousWeeks","Today") | eval temp=1 | chart avg(count) as avg over temp by Day | where Today > PreviousWeeks*1.1
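
If it helps to sanity-check what these searches compute, here is a rough sketch of the same comparison outside Splunk, in Python. It is only an illustration: the timestamps are made up, and the helper names `window_counts` and `should_alert` are hypothetical, not anything Splunk provides. It keeps events whose hour falls in the 12am-2am window, counts them per day, averages the previous days, and checks whether today's count is more than 10% above that average.

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def window_counts(timestamps, start_hour=0, end_hour=2):
    """Count events per calendar day, keeping only hours in [start_hour, end_hour)."""
    return Counter(
        ts.date() for ts in timestamps if start_hour <= ts.hour < end_hour
    )


def should_alert(today_count, previous_counts, factor=1.1):
    """True when today's count is more than factor times the historical average."""
    return today_count > factor * mean(previous_counts)


# Made-up error timestamps: two earlier days plus "today" (2015-09-03).
events = [
    datetime(2015, 9, 1, 0, 15), datetime(2015, 9, 1, 1, 40),
    datetime(2015, 9, 1, 14, 5),   # outside the 12am-2am window, ignored
    datetime(2015, 9, 2, 0, 30), datetime(2015, 9, 2, 1, 10),
    datetime(2015, 9, 3, 0, 5), datetime(2015, 9, 3, 0, 50),
    datetime(2015, 9, 3, 1, 20),
]

counts = window_counts(events)
today = datetime(2015, 9, 3).date()
# Per-day counts for every day before today (the baseline).
previous = [n for day, n in counts.items() if day < today]

print(should_alert(counts[today], previous))
```

With this sample data the previous days each have 2 errors in the window and today has 3, which is above 1.1 × 2, so the check prints True. The hour filter plays the role of `where date_hour>=0 AND date_hour<2`, and the final comparison plays the role of `where Today > PreviousWeeks*1.1`.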