Alerting

Create an alert that runs every five minutes and, once triggered, is throttled for the rest of the calendar day

Engager

We want to create an alert that runs every 5 minutes and checks the results returned by a query. If the results for the current calendar day exceed a specific limit, the alert should trigger.

Once the alert triggers, we want to throttle it for the remainder of the day.

Is that possible?

Keep in mind that a rolling-window approach is not suitable for this problem.

1 Solution

Champion

Hello,
Throttling won't work in your case, because you want suppression to last until the end of the calendar day, not for a fixed 24-hour interval.
So it should be controlled in the search itself.

sourcetype=log4net source=Accounts "Account notification failed for "
| eval date=strftime(_time, "%y-%m-%d")
| stats dc(AccountId) as TotalNotificationsFailed by date
| join source
    [| search sourcetype=log4net source=Accounts "Account notification failed for "
    | sort + _time
    | head 10
    | stats max(_time) as LatestEvtTime]
| eval t=now()-300
| where TotalNotificationsFailed > 10 AND t < LatestEvtTime

Thanks

Engager

yep, makes sense. thanks

Champion

Since your alert runs every 5 minutes, I am only checking whether the time of the 10th error message falls within the last 5 minutes, which is why t is the current time minus 300 seconds. If it doesn't, the alert is not triggered again, so you don't need to set any other condition.
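
The check described above can be modelled outside Splunk. The Python sketch below is only an illustration under assumptions: the function name `should_fire`, its parameters, and the sample timestamps are hypothetical, it counts events rather than distinct AccountId values, and it fires when the 10th event is reached rather than when the count exceeds 10.

```python
from datetime import datetime, timedelta

def should_fire(event_times, now, threshold=10, interval_s=300):
    """Return True only on the scheduled run that follows the threshold-th event of the day."""
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Events of the current calendar day, oldest first (like `sort + _time`).
    today = sorted(t for t in event_times if day_start <= t <= now)
    if len(today) < threshold:
        return False
    # Time of the threshold-th event (like `head 10 | stats max(_time)`).
    crossing_time = today[threshold - 1]
    # Fire only if that event arrived since the previous run (like `t < LatestEvtTime`).
    return crossing_time > now - timedelta(seconds=interval_s)

# Hypothetical data: ten failures, one per minute, the 10th at 09:09.
base = datetime(2024, 1, 1, 9, 0)
events = [base + timedelta(minutes=i) for i in range(10)]
print(should_fire(events, datetime(2024, 1, 1, 9, 10)))  # True: 10th event is under 5 minutes old
print(should_fire(events, datetime(2024, 1, 1, 9, 15)))  # False: 10th event is older than 5 minutes
```

Because the 10th event of a day can only be "new" for one 5-minute run, later runs on the same day stay quiet without any throttling setting.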

Engager

What does "t=now()-300" aim to achieve?

My understanding is that you are trying to figure out whether the last trigger time is within the current day and, if not, trigger again?

Engager

Sure, consider the following:

sourcetype=log4net source=Accounts "Account notification failed for " | eval date=strftime(_time, "%y-%m-%d") | stats dc(AccountId) as TotalNotificationsFailed by date

Basically, if you get 10 failed notifications in a given calendar day, the alert should fire and then be suppressed for the rest of the day. Currently it fires, the cron job kicks in again 5 minutes later, and the alert re-triggers.

The time window is @d to 'now'.
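
To illustrate why the calendar-day window (@d to now) behaves differently from a rolling window, here is a small Python sketch. The function names and the sample timestamps are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta

def calendar_day_count(event_times, now):
    """Count events from midnight of the current day up to `now` (like @d to now)."""
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return sum(1 for t in event_times if day_start <= t <= now)

def rolling_count(event_times, now, hours=24):
    """Count events in the last `hours` hours (like -24h to now)."""
    window_start = now - timedelta(hours=hours)
    return sum(1 for t in event_times if window_start <= t <= now)

# Hypothetical data: 8 failures late on Jan 1, then 2 failures just after midnight.
events = [datetime(2024, 1, 1, 23, 0) + timedelta(minutes=5 * i) for i in range(8)]
events += [datetime(2024, 1, 2, 0, 10), datetime(2024, 1, 2, 0, 20)]
now = datetime(2024, 1, 2, 0, 30)

print(calendar_day_count(events, now))  # 2: only the events since midnight count
print(rolling_count(events, now))       # 10: yesterday's events still count
```

With a limit of 10, the rolling window would reach the limit here, while the calendar-day window starts again from zero at midnight.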

SplunkTrust

You would only get the text box labelled "Suppress the result..." while configuring throttling. See "http://docs.splunk.com/Documentation/Splunk/5.0.5/Alert/Defineper-resultalerts".

Would it be possible for you to provide the search you're executing?

Engager

By suppress, do you mean enabling throttling? I tried that and set it to, e.g., 5 minutes by the Date field. That worked, but after 5 minutes the alert was re-triggered.

SplunkTrust

Try this. Add a field, say 'Date', to your results that holds the current calendar day. Then, in the triggering configuration, select "once per result", and in the throttling configuration provide this 'Date' field under "Suppress the result with the same field value". Since this will be a per-result alert, modify your search to return just one row (which I believe you're doing anyway).
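
The effect of throttling on a 'Date' field can be sketched with a simplified Python model. This is not Splunk's implementation: the class `FieldThrottle`, the helper `fired_runs`, the run times, and the 24-hour period are all assumptions made for illustration (the post does not state a throttle period).

```python
from datetime import datetime, timedelta

class FieldThrottle:
    """Simplified model of per-result throttling keyed on one field value."""

    def __init__(self, period):
        self.period = period   # how long a fired key stays suppressed
        self.last_fired = {}   # field value -> time the alert last fired

    def allow(self, key, now):
        """Return True (and record the firing) if `key` is not currently suppressed."""
        fired_at = self.last_fired.get(key)
        if fired_at is not None and now - fired_at < self.period:
            return False
        self.last_fired[key] = now
        return True

def fired_runs(throttle, runs):
    """Each run returned one result row whose Date field is the run's calendar day."""
    return [now for now in runs if throttle.allow(now.strftime("%Y-%m-%d"), now)]

# Hypothetical runs in which the search returned a row (limit already exceeded).
runs = [
    datetime(2024, 1, 1, 14, 0), datetime(2024, 1, 1, 14, 5),
    datetime(2024, 1, 1, 14, 10), datetime(2024, 1, 1, 23, 55),
    datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 9, 5),
]

short = fired_runs(FieldThrottle(timedelta(minutes=5)), runs)
long = fired_runs(FieldThrottle(timedelta(hours=24)), runs)
print(len(short))  # 6: a 5-minute period has expired by the next run
print(long)        # fires once on 2024-01-01 14:00 and once on 2024-01-02 09:00
```

In this model, a period at least as long as a day keeps the same 'Date' value suppressed until midnight, and the new day's value is a new key, so the alert can fire again the next day.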