I have configured an alert with a threshold of 25 or more occurrences in 10 minutes: if an exception, let's say "IllegalStateException", appears 25 or more times in my log file within 10 minutes, Splunk should generate an alert and send an email to the defined recipients.
Now, according to this alert condition:
If we have, let's say, 150 occurrences of "IllegalStateException" in my log file within one hour (e.g. 5 exceptions every 2 minutes), then Splunk should generate 6 alerts and send 6 emails.
However, in our project we have received 2600+ alert emails under these conditions.
Can anybody explain the following:
With the condition above, will Splunk check the logs for the exception in consecutive, non-overlapping windows, for example
11:00 AM to 11:10 AM
11:10 AM to 11:20 AM
11:20 AM to 11:30 AM and so on
or will it check overlapping windows, like:
11:00 AM to 11:10 AM
11:01 AM to 11:11 AM
11:02 AM to 11:12 AM and so on
Could there be an issue with our project's Splunk setup, or is this how Splunk works?
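For what it's worth, the difference between the two window styles asked about can be sketched with a small simulation. This is only a toy model in Python, not Splunk's actual scheduler: the names `count_alerts`, `THRESHOLD`, and `WINDOW_MIN` are made up for illustration, and the event pattern is the "5 exceptions every 2 minutes" example from the question.

```python
THRESHOLD = 25   # alert when a window holds this many events or more
WINDOW_MIN = 10  # window length in minutes

# Hypothetical log: 5 exceptions every 2 minutes for one hour (150 events).
# Each entry is the minute at which one exception was logged.
events = [minute for minute in range(0, 60, 2) for _ in range(5)]


def count_alerts(events, step_min, horizon_min=60):
    """Count windows of WINDOW_MIN minutes that reach THRESHOLD.

    step_min is how far the window start advances between checks:
    10 gives back-to-back windows, 1 gives overlapping windows.
    """
    alerts = 0
    for start in range(0, horizon_min - WINDOW_MIN + 1, step_min):
        in_window = sum(1 for t in events if start <= t < start + WINDOW_MIN)
        if in_window >= THRESHOLD:
            alerts += 1
    return alerts


tumbling_alerts = count_alerts(events, step_min=10)  # 11:00-11:10, 11:10-11:20, ...
sliding_alerts = count_alerts(events, step_min=1)    # 11:00-11:10, 11:01-11:11, ...
print(tumbling_alerts, sliding_alerts)
```

Under these assumptions, back-to-back windows give 6 alerts for the hour, while windows that advance by one minute give 51, because every overlapping window still holds 25 events.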
An alert can trigger frequently when the search keeps returning similar results. The schedule on which the alert runs can also cause it to trigger frequently. To reduce how often the alert fires, you can use throttling. Configure these two parameters:
1. Time period
In a real-time search the time window is 60 seconds, so alerts may fire frequently. With throttling you can suppress all successive alerts of the same type for the next 10 minutes, or for whatever period you require.
2. Field values
You can also specify a field value so that alerts are suppressed only for that particular field.
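The effect of a time-based throttle can be sketched roughly as follows. This is a simplified model in Python and not how Splunk implements it internally; `apply_throttle` and `suppress_min` are hypothetical names, and the trigger pattern (the condition being met once a minute for an hour) is an assumption for the example.

```python
def apply_throttle(trigger_times, suppress_min):
    """Return the trigger times that still fire once throttling is applied.

    After an alert fires, any later trigger within suppress_min minutes
    is suppressed.
    """
    fired = []
    last_fired = None
    for t in sorted(trigger_times):
        if last_fired is None or t - last_fired >= suppress_min:
            fired.append(t)
            last_fired = t
    return fired


# Assumed scenario: the alert condition is met every minute for one hour.
triggers = list(range(0, 60))
fired = apply_throttle(triggers, suppress_min=10)
print(len(triggers), "triggers ->", len(fired), "alerts sent at minutes", fired)
```

With a 10 minute suppression period, the 60 triggers in this sketch collapse to 6 alerts.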
Hope it helps.
I'm sure you probably have resolved this, but for posterity and other searchers, I'll write this in. If your answer was totally different from my suggestions, please feel free to write in your own answer and mark it as accepted!
Splunk should work OK like this, but a lot depends on the settings you actually used.
First, let me mention the docs for Alerting are fairly clear and if you ran through a few of those test scenarios and samples, I'm sure you'll be able to confirm whether it's just a wrong setting or if it actually doesn't work right.
I think what you have is a mix-up of some sort. You may be running a search that returns all the hits for the past 10 minutes, but are running it in real time and somehow have it set to send one alert for each result. So, for instance, when a new event comes in at 10:01:01 AM, it alerts for that event, then for EACH event in the previous ten minutes. Then at 10:01:13 AM another event comes in, so it triggers an alert for the one at 10:01:13 AM, and one for the event at 10:01:01 AM, and so on back for ten minutes. Essentially, you should be receiving approximately (events per time period)^2 alerts. It may not be that many, but I'm sure this is at least vaguely the problem.
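To get a feel for how quickly that adds up, here is a rough Python sketch of the hypothesis above: every incoming event produces one alert for itself plus one for each event still inside the previous 10 minutes. It is a toy model of my guess, not of Splunk's real-time alerting, and `per_result_alerts` is a made-up name. The event pattern is the "5 exceptions every 2 minutes" example from the question.

```python
WINDOW_MIN = 10

# Hypothetical log: 5 exceptions every 2 minutes for one hour (150 events),
# each entry being the minute at which one exception arrived.
events = [minute for minute in range(0, 60, 2) for _ in range(5)]


def per_result_alerts(events, window_min=WINDOW_MIN):
    """Total alerts if each arriving event re-alerts on every event
    (itself included) seen within the last window_min minutes."""
    ordered = sorted(events)
    total = 0
    for i, t in enumerate(ordered):
        # Events up to and including this one that are still in the window.
        total += sum(1 for earlier in ordered[: i + 1] if t - earlier < window_min)
    return total


total_alerts = per_result_alerts(events)
print(len(events), "events ->", total_alerts, "alerts")
```

For this pattern the sketch gives 3200 alerts from 150 events, which is at least the same order of magnitude as the 2600+ emails reported, though that alone doesn't prove this is the cause.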
If I were you, I'd create the search to run and return a stats count, like at the top section of the CreateScheduledAlerts docs. Set your earliest time to -5m to tell it to go back 5 minutes. Then trigger it on Number of Results greater than whatever abnormal threshold you'd like (more than 5? more than 0?) and use a cron schedule to run it every 5 minutes (that's the first example given in the section immediately following, so just copy/paste that in). One caveat: a search ending in stats count returns a single row, so if you keep the stats count you will probably need a custom trigger condition on the count field rather than Number of Results.
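The scheduled approach above behaves roughly like the following Python sketch: one run every 5 minutes, each looking back 5 minutes, with at most one alert per run. The names `scheduled_runs`, `EVERY_MIN`, and `LOOKBACK_MIN` are hypothetical, the "more than 5" threshold is just one of the example values mentioned, and the event pattern is again the one from the question.

```python
THRESHOLD = 5      # alert when the count is greater than this
EVERY_MIN = 5      # scheduled run interval in minutes
LOOKBACK_MIN = 5   # how far back each run searches (earliest = -5m)

# Hypothetical log: 5 exceptions every 2 minutes for one hour (150 events).
events = [minute for minute in range(0, 60, 2) for _ in range(5)]


def scheduled_runs(events, horizon_min=60):
    """Return (run_minute, count) for each scheduled run that would alert."""
    triggered = []
    for run_at in range(EVERY_MIN, horizon_min + 1, EVERY_MIN):
        # Count the events that fall inside this run's lookback window.
        count = sum(1 for t in events if run_at - LOOKBACK_MIN <= t < run_at)
        if count > THRESHOLD:
            triggered.append((run_at, count))
    return triggered


triggered = scheduled_runs(events)
print(len(triggered), "alerts:", triggered)
```

Because each run can alert only once, the number of emails is capped by the number of scheduled runs (12 per hour here), however many events arrive.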