Alerting

How do I search for recent alerts fired by Splunk?

Contributor

I'd like to build a "Recent Alerts" report listing which alerts have been fired by Splunk in the last few days.

When a Splunk alert is fired, I'm assuming there's an event written somewhere in the _internal index which I can use for this. Does anyone know what search query I should use to pull these events out of Splunk's internal logs?

BTW, this report will be useful for several reasons, including:

  • troubleshooting problems with our alert scripts (e.g. alert fired but alert script didn't do what we thought it should)
  • giving folks outside the ops team a view into the problems the ops team is working on
  • providing a failsafe option if email is down

2 Solutions

Splunk Employee

Splunk 4.0 doesn't have terribly good log events for alerting. You can see that a search was run, but not that it was run by the scheduler, so you cannot differentiate between manually initiated and scheduler-initiated searches. You can see the Python event if the search eventually invokes the email-sending command sendemail.py, but that only catches searches whose conditions were met and which were configured to send email.

In 4.1, all scheduled searches are explicitly logged, along with the result (conditions met / not met). If a search should have run but did not for some reason, that is also logged. There are built-in status views that try to give useful reporting on this data, but you can build your own slices of it.
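
For example, assuming the 4.1 scheduler events in _internal carry savedsearch_name and status fields (this is just a sketch, so check what your own events actually contain), you could get a quick breakdown of scheduled-search outcomes per saved search with:

index=_internal sourcetype=scheduler | stats count by savedsearch_name, status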

Motivator

I use the following search on the _internal index in version 4.1+ to report on alerts that have been triggered:

index="_internal" sourcetype="scheduler" thread_id="AlertNotifier*" NOT (alert_actions="summary_index" OR alert_actions="")

I am excluding summary_index alert actions since I am only interested in "real" alerts, not summary-index searches. You can easily build a report based on the results of this search. If you use Splunk for PCI compliance in particular, a report showing all alerts fired over a period of time will go a long way toward satisfying the daily log review requirement.
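
As a rough sketch of one such report (assuming the scheduler events also include a savedsearch_name field), you could append a timechart to count firings per alert per day:

index="_internal" sourcetype="scheduler" thread_id="AlertNotifier*" NOT (alert_actions="summary_index" OR alert_actions="") | timechart span=1d count by savedsearch_name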

Communicator

You can also use the following search:

index=_audit action=alert_fired

which has the added benefit of giving you the expiration time and the severity. For example, you could create a report of the currently active alerts like this:

index=_audit action=alert_fired | eval ttl=expiration-now() | search ttl>0 | convert ctime(trigger_time) | table trigger_time ss_name severity
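
And for the original "recent alerts in the last few days" report, here is a sketch using the same ss_name and trigger_time fields, listing everything that fired in the last three days, most recent first:

index=_audit action=alert_fired earliest=-3d | sort - trigger_time | convert ctime(trigger_time) | table trigger_time ss_name severity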
