New to Splunk.
Trying to watch an application for abnormal response time behavior and I can't get the alert to trigger. I am obviously doing something wrong here.
Here's the search string I'm using:
host=MyHost* GET campaignID="*" clientID="*" memberID="*"| bucket _time span=1m | stats avg(time_taken) as avg by _time | where avg > 30
When I run it in Search over All time, it returns chunks of 5-minute results going back to the software's initial release (about 9k matches on 4M events).
I had the alert scheduled on cron */5 * * * * and I have tried both of these trigger conditions:
Number of results: greater than 0
Trigger Condition Custom: where avg(time_taken) > 30
Neither of them generated an email (the email path is definitely working) or showed up in the Triggered Alerts list.
So I actually have two questions:
1. Is this the right way to get the result I want, which is basically to check every 5 minutes to see if the average GET response is higher than 30 seconds?
2. Once I (we) figure out how to trigger the alert, is it going to blast me with results from all time, or just the last 5 minutes?
Based on your search, it looks like you want to know if the average response time is greater than 30 seconds for any 1-minute slice of time. Assuming that's correct, your search should work. If you lower the threshold (i.e. change "where avg > 30" to something lower, whatever is typical in your environment), does it generate an e-mail?
Also, what time range are you using in your saved search? If it's all-time and you are running it every 5 minutes, that could be why it's not triggering. I would add in earliest=-5m latest=now to your search in addition to scheduling it to run every 5 minutes.
"Number of results greater than 0" should be a sufficient alert trigger.
As for blasting you with results, you can configure Splunk to throttle alerts. The alerting manual has pretty good documentation on this: http://docs.splunk.com/Documentation/Splunk/6.2.1/Alert/Aboutalerts
Another thing you can try: does this search generate any alerts? (It takes a single average over the entire five-minute window, rather than one average per minute, and then keeps the result only if that average is above 1 second.)
host=MyHost* GET campaignID="*" clientID="*" memberID="*" earliest=-5m latest=now | stats avg(time_taken) as avg | search avg > 1
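To make the difference between the two searches concrete, here is a minimal Python sketch (not Splunk code) of what each one computes. The event list, the function names, and the 60-second bucket size are all made up for illustration; time_taken is assumed to be in seconds, as in the question.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical events: (epoch_seconds, time_taken_in_seconds)
events = [
    (0, 2.0), (30, 4.0),      # minute 0: average 3.0
    (60, 40.0), (90, 50.0),   # minute 1: average 45.0
    (120, 1.0), (150, 3.0),   # minute 2: average 2.0
]

def per_minute_averages(events, span=60):
    """Roughly what 'bucket _time span=1m | stats avg(time_taken) by _time' does."""
    buckets = defaultdict(list)
    for ts, taken in events:
        buckets[ts - ts % span].append(taken)  # snap timestamp to bucket start
    return {start: mean(vals) for start, vals in sorted(buckets.items())}

def slow_minutes(events, threshold):
    """Roughly what the trailing 'where avg > threshold' keeps."""
    return {s: a for s, a in per_minute_averages(events).items() if a > threshold}

def window_average(events):
    """Roughly what 'stats avg(time_taken) as avg' over the whole window does."""
    return mean(taken for _, taken in events)

print(slow_minutes(events, 30))           # one slow minute survives the filter
print(round(window_average(events), 2))   # the whole-window average is much lower
```

With this sample data, the per-minute version returns one row (the bucket starting at 60 seconds), so "number of results greater than 0" would fire, while the whole-window average stays below 30 because one bad minute is diluted by the quiet ones.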
I have a question about how to set a cron schedule so that an alert triggers every 10 minutes within a time window.
For example, I want the alert to be triggered every 10 minutes from 3 AM to 9 PM only. I tried this, but it doesn't seem to be working as expected.
*/10 03-23 * * *
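One way to check what an expression like this covers is to expand the minute and hour fields by hand. The sketch below is a small, hypothetical expander written for illustration; it assumes standard five-field cron semantics for `*`, `a-b`, `*/n`, and `a-b/n`, and it is not Splunk's own parser.

```python
def expand_field(field, lo, hi):
    """Expand one cron field. Handles *, a-b, */n, a-b/n, and comma lists."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)

def run_times(minute_field, hour_field):
    """List every HH:MM in a day at which the schedule would fire."""
    return [
        f"{h:02d}:{m:02d}"
        for h in expand_field(hour_field, 0, 23)
        for m in expand_field(minute_field, 0, 59)
    ]

tried = run_times("*/10", "03-23")    # the expression from the post
narrower = run_times("*/10", "3-20")  # hours 3 through 20 only

print(tried[0], tried[-1])        # first and last run of the day
print(narrower[0], narrower[-1])
```

Under those assumptions, the hour range 03-23 keeps firing until 23:50, which is 11 PM rather than 9 PM. An hour range of 3-20 stops at 20:50; if a run at exactly 21:00 is also needed, that would probably have to be a separate schedule such as `0 21 * * *`.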