We monitor a system that runs tasks. One of the metrics we monitor is the number of failed tasks. It is a cumulative counter that only increases until the system is rebooted: for example, if the current value is 1162 and another task fails, it becomes 1163.
Users want to be alerted if the value increases (goes from 1162 to 1163) and want to keep receiving alerts until the events are acknowledged. After acknowledgement, users no longer want to receive any alerts until the value increases again.
We created an alert that compares the current value with the previous one, but it stops firing after the initial increase.
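For reference, the search behind our alert looks roughly like this (the index, sourcetype and field names are simplified placeholders, not our real ones):

index=task_metrics sourcetype=task_stats
| sort 0 _time
| streamstats current=f window=1 last(failed_tasks) as prev_failed_tasks
| where failed_tasks > prev_failed_tasks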
Any help would be greatly appreciated!
You can try adding the following lines to $SPLUNK_HOME/etc/apps/SA-ITOA/local/itsi_notable_event_status.conf:
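# Note: this assumes the default status mapping, where stanza [2] is "In Progress",
# the status an episode moves to when it is acknowledged. Setting end = 1 on that
# status treats acknowledgement as closing the episode, so the next failure opens a
# new episode and alerting resumes.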
[2]
end=1
And set "If the number of events in this episode is greater than or equal to 1163" then send alerts in Aggregation Policy Action Rules page. Then you will keep getting alerts until the episode is ack'ed.
Refer to https://docs.splunk.com/Documentation/ITSI/4.3.1/Configure/itsi_notable_event_status.conf