Have Alert Check Three Times before Sending Email


Currently, we are trying to set up an alert for our AWS instances to report if the CPU is >= 90%. What we want to have happen is that once Splunk sees this, it will test two more times (waiting a shorter amount of time between checks), then send out the actual alert. It will continue this pattern until the alert clears.

Example: The alert is scheduled on cron to run on the 45-minute mark of the hour, every hour. At 10:00am, Splunk sees that there is a server sitting at 91%. At this point it would not send out an alert, but would wait 5 minutes and check again; if the server is still at 91%, it still does not send out the alert. On the third check, after another 5 minutes have passed, if the results are still the same, this is when Splunk would send out the alert to the requested email. This process would repeat until the alert clears.

I have found, when trying to create an alert, that there is the Throttle option. I was thinking that maybe if we set the time for every 45 minutes, then once it sees the error and is throttled for 10 minutes or so, the alert would be sent out after the throttle, and then go back and throttle again for another 10 minutes. (Please let me know if that makes sense, or if Throttle only suppresses immediately when active but does not cause Splunk to check again after the throttle has been engaged.)



How we did it: we are getting CPU data in every 5 minutes, so we scheduled the alert to run every 6 minutes and look at the average of the last x (say 3) samples. If the average is greater than 90, send an alert.
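As an added illustration of the approach in the answer above (this is not the answerer's own configuration), here is what such a scheduled alert looks like as a savedsearches.conf stanza. The index, sourcetype and field names (`aws:cloudwatch`, `metric_name`, `Average`, `InstanceId`) are assumed to match how CloudWatch CPU data lands in Splunk and should be replaced with whatever the actual data uses; the recipient address and stanza name are placeholders. The `samples >= 3` guard is there so that a single late sample cannot, on its own, be mistaken for a three-sample average.

```ini
# Added example, not from the original thread. Assumes CloudWatch CPU data
# arrives every 5 minutes with fields metric_name, Average and InstanceId.
[aws_cpu_sustained_90pct]
enableSched = 1
# Run every 6 minutes, slightly slower than the 5-minute data cadence.
cron_schedule = */6 * * * *
# Look back far enough to cover the last three 5-minute samples.
dispatch.earliest_time = -18m@m
dispatch.latest_time = now
search = index=aws sourcetype="aws:cloudwatch" metric_name=CPUUtilization \
| bin _time span=5m \
| stats max(Average) AS cpu_pct BY _time InstanceId \
| stats count AS samples avg(cpu_pct) AS avg_cpu max(cpu_pct) AS max_cpu BY InstanceId \
| where samples >= 3 AND avg_cpu >= 90
# Fire when the search returns at least one instance.
counttype = number of events
relation = greater than
quantity = 0
alert.track = 1
# Throttle per instance: while the condition persists, at most one email
# per instance every 30 minutes (throttling suppresses, it does not re-check).
alert.suppress = 1
alert.suppress.period = 30m
alert.suppress.fields = InstanceId
action.email = 1
action.email.to = ops@example.com
action.email.subject = CPU >= 90% sustained on $result.InstanceId$
```

If the requirement is three consecutive breaches rather than an average (closer to what the question asks for), the last two lines of the search can be swapped for `| stats count AS samples sum(eval(cpu_pct>=90)) AS breaches BY InstanceId | where samples >= 3 AND breaches = samples`, which only matches instances whose every sample in the window was at or above 90%.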


Hi sgoodman26,

I'm unsure if this will work out as you intend it to.

Splunk alerts don't work that way. Also, this is true: "Throttle only suppresses immediately when active but does not cause Splunk to check again after the throttle has been engaged."