Hi Splunkers,
I am working on an alert which calculates the error rate (> 30%) and sends the alerts to PagerDuty via API.
index="test1" source="mylogs" NOT "TEST" earliest=-30m latest=now (":search-result match-indicator=\"PASS\" *******query*******
| stats count(id) AS OKStats
| appendcols [ search source=mylogs (":search-result match-indicator=\"ERROR\" *******query*******) NOT "TEST" earliest=-30m latest=now | stats count(id) AS ERRORStats]
| eval TotalTransactions = OKStats + ERRORStats
| eval ErrorRate = if(TotalTransactions > 0, round((ERRORStats / TotalTransactions) * 100, 2), 0)
| where ErrorRate >= 30
| eval dedup_key="HighErrorRate"
| table ErrorRate, dedup_key
Now, to clear the alert, I created another alert (< 30%):
index="serverlogs" source="mylogs" NOT "TEST" earliest=-30m latest=now (":search-result match-indicator=\"PASS\" ************myqueryparams************
| stats count(id) AS OKStats
| appendcols [ search source=mylogs (":search-result match-indicator=\"ERROR\" ************myqueryparams************) NOT "TEST" earliest=-30m latest=now | stats count(id) AS ERRORStats]
| bin _time span=5m
| eval ErrorPercent = if((OKStats + ERRORStats) > 0, round(ERRORStats / (OKStats + ERRORStats) * 100, 2), 0)
| sort -_time
| streamstats window=2 latest(ErrorPercent) as latest_percent, latest(_time) as latest_time, earliest(ErrorPercent) as earliest_percent
| where latest_time = _time AND latest_percent < 30 AND earliest_percent >= 30
| head 1
| eval dedup_key="HighErrorRate"
| table latest_percent, earliest_percent, dedup_key
For both alerts I set the trigger condition to number of results > 1, sending to PagerDuty, running every 30 min with throttling enabled. I do not see the second alert working or clearing the alert.
Any advice on how to achieve the clearing of alerts, meaning the alert should be resolved on PagerDuty? Creating a Python script is currently out of scope due to security reasons, hence I was trying to do it via a Splunk query.
Regards,
Amit
In order to prevent the second search from repeatedly firing when the threshold is not met, it needs some way of knowing about the previously triggered alert.
The way I would probably do this is to write to a lookup table as part of the first search, setting a flag that indicates the high-error alert has fired, e.g.:
| eval isAlerting=1
| outputlookup myalert_status
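For example, the tail end of your first (trigger) search might look something like this — just a sketch, keeping your field names and assuming the lookup is called myalert_status (the override_if_empty=false option is optional; it just stops a run with no results from wiping the flag):
``` ...your existing error-rate calculation... ```
| where ErrorRate >= 30
| eval dedup_key="HighErrorRate"
| eval isAlerting=1
``` record that the high-error alert has fired ```
| outputlookup override_if_empty=false myalert_status
| table ErrorRate, dedup_key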
Then, in your second search, check whether the alert has previously fired; if it has, clear it:
index="serverlogs" source="mylogs" NOT "TEST" earliest=-30m latest=now (":search-result match-indicator=\"PASS\" ************myqueryparams************
| stats count(id) AS OKStats
| appendcols [ search source=mylogs (":search-result match-indicator=\"ERROR\" ************myqueryparams************) NOT "TEST" earliest=-30m latest=now | stats count(id) AS ERRORStats]
| bin _time span=5m
| eval ErrorPercent = if((OKStats + ERRORStats) > 0, round(ERRORStats / (OKStats + ERRORStats) * 100, 2), 0)
| sort -_time
| streamstats window=2 latest(ErrorPercent) as latest_percent, latest(_time) as latest_time, earliest(ErrorPercent) as earliest_percent
| where latest_time = _time AND latest_percent < 30 AND earliest_percent >= 30
| head 1
| eval dedup_key="HighErrorRate"
| table latest_percent, earliest_percent, dedup_key
``` your search above ```
| appendcols [|inputlookup myalert_status]
| stats first(*) as *
| where isAlerting=1
``` Will not get past here if there is no previous alert fired ```
| eval isAlerting=0
| outputlookup myalert_status
``` Cleared the alert so it will not run multiple times ```
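One more thing: if the lookup has never been written yet, the | inputlookup myalert_status subsearch may complain that the lookup does not exist, so it can be worth seeding it once with a throwaway search along these lines:
| makeresults
| eval isAlerting=0
``` start in the cleared state ```
| table isAlerting
| outputlookup myalert_status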
Thanks livehybrid.
I have applied the changes and so far have not received any alert. I will wait for the threshold to be breached and see whether it clears the alert based on the lookup entry.
So far I have tested it manually by lowering the threshold, and it created a lookup entry with the alert status changing from 1 to 0.
Thanks,
Amit
Splunk doesn't remember past alerts; it simply fires based on the current query results, and if no results are returned, no alert is triggered.
Can you try a single scheduled alert instead, sending a "trigger" (error rate ≥ 30%) or a "resolve" (error rate < 30%) to PagerDuty with the same dedup_key?
| eval status=if(ErrorRate>=30,"triggered","resolved")
| eval dedup_key="HighErrorRate"
| table status, dedup_key, ErrorRate
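For example, on top of the OKStats / ERRORStats / TotalTransactions calculation you already have in your first search, the tail end could look roughly like this (just a sketch; how the status field is mapped to a PagerDuty trigger/resolve event depends on how your PagerDuty integration is configured):
``` ...your existing error-rate calculation... ```
| eval ErrorRate = if(TotalTransactions > 0, round((ERRORStats / TotalTransactions) * 100, 2), 0)
| eval status=if(ErrorRate>=30,"triggered","resolved")
``` no threshold filter here, so the search always returns one row ```
| eval dedup_key="HighErrorRate"
| table status, dedup_key, ErrorRate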
Regards,
Prewin
If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!
Thanks @PrewinThomas for looking into this and replying.
With this solution it will keep triggering the resolution alert in every 30-minute window (based on my alert condition), since the error rate is always below the threshold.
My requirement is as follows:
I need an alert to trigger as soon as the error rate crosses 30%, and if it is below 30% the existing alert on PagerDuty should be cleared; this check should run every 30 minutes.
I do not need any alert (PagerDuty notification) when it is below 30%, only the clearing of the existing alert.
Thanks,
Amit