Splunk Enterprise

How to create an alert that triggers only when the failure rate is consistently high

shashank_24
Hi, I have a requirement where I want to create an alert on some of my APIs which are being monitored in Splunk.

I've created a search that counts the successes and failures for each API, calculates the failure rate, and triggers the alert if that rate is more than 10%.
The problem is that the alert is also triggered by large blips that last only a short time. For example, the error rate spikes for 5 minutes and then recovers on its own. I don't want the alert to trigger in that situation, because it causes unnecessary callouts to people for an investigation that isn't required.

How can I create an alert that runs every 30 minutes and checks the failure rate in each 5-minute interval of the last 30 minutes? The alert should trigger only if the failure rate stays above the threshold for more than 15 to 20 minutes.

This is my base search

index=api_prod (message.httpResponseCode=50* OR message.httpResponseCode=20*)
| rename message.serviceName as serviceName message.httpResponseCode as httpResponseCode
| stats count as totalrequests count(eval(like(httpResponseCode, "20%"))) as successrequest count(eval(like(httpResponseCode, "50%"))) as failedrequest by serviceName
| eval failureRatePercentage = round(((failedrequest/totalrequests) * 100),2)
| where failureRatePercentage > 10
| table serviceName,totalrequests,successrequest,failedrequest,failureRatePercentage
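
To make the goal concrete, here is a rough, untested sketch of the direction I have in mind, built on the base search above. It splits the last 30 minutes into 5-minute buckets, works out the failure rate per bucket, and keeps only the services where at least 4 of the 6 buckets are over 10% (roughly 20 minutes). The field names breached, breachedBuckets, totalBuckets and worstFailureRate, the threshold of 4 buckets, and the earliest/latest window are my own assumptions. The sketch counts breached buckets anywhere in the window, so it does not strictly require them to be consecutive.

```
index=api_prod (message.httpResponseCode=50* OR message.httpResponseCode=20*) earliest=-30m@m latest=@m
| rename message.serviceName as serviceName message.httpResponseCode as httpResponseCode
| bin _time span=5m
| stats count as totalrequests count(eval(like(httpResponseCode, "50%"))) as failedrequest by _time serviceName
| eval failureRatePercentage = round((failedrequest / totalrequests) * 100, 2)
| eval breached = if(failureRatePercentage > 10, 1, 0)
| stats sum(breached) as breachedBuckets count as totalBuckets max(failureRatePercentage) as worstFailureRate by serviceName
| where breachedBuckets >= 4
```

The idea would be to schedule this every 30 minutes on the hour and half hour, so that the 5-minute buckets line up with the search window, and to trigger the alert when the number of results is greater than 0.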

Any guidance is really appreciated.

Best Regards,
Shashank
