Alerting

How to: Schedule an alert every time a job fails more than once within an hour

christinaef07
Loves-to-Learn Everything

Hello, I am trying to create an alert in Splunk that fires every time a job fails two or more times within an hour. We have several different jobs running. Right now, I have a table that lists each job with its number of failures.

index=?? uuid=* |search status=success | rex "message=(?<message>.*)" | stats count(eval(status=="failed")) AS Failures by workflow_name | table workflow_name, Failures

This displays something like:

workflow_name    Failures
workflow_1       3
workflow_2       1
workflow_3       7

How can I change this to include only the workflows that have failed more than once (workflow_1 and workflow_3) within a specific time frame of one hour? Additionally, I want to pull in information about the latest failure for each workflow (for example, message, uuid, etc.). For example:

 

workflow_name    Failures    Latest message    Latest uuid
workflow_1       3           error msg         12345678
workflow_3       7           error msg         98765432

 


aohls
Contributor

A where clause at the end of your query should do it: | where Failures > 1. Then you could schedule the search to run on whatever time frame you need.
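
Putting that together with the extra columns asked for in the question, the search might look roughly like the sketch below. It assumes that status, uuid and workflow_name are already extracted fields (the original query uses them that way), keeps the index=?? placeholder from the question as-is, and reuses the question's rex for message. It filters on status=failed rather than status=success, since a search restricted to successful events would leave no failures to count. The earliest=-1h time modifier limits the search to the last hour, and latest() returns the value from the most recent matching event in each group.

```
index=?? uuid=* status=failed earliest=-1h
| rex "message=(?<message>.*)"
| stats count AS Failures, latest(message) AS "Latest message", latest(uuid) AS "Latest uuid" by workflow_name
| where Failures > 1
```

Saved as a scheduled alert (for example on an hourly cron schedule such as 0 * * * *) with the trigger condition set to "number of results is greater than 0", this should fire only when at least one workflow has failed more than once in the window. If failures that straddle two hourly runs matter, running the same one-hour search more often is one option, at the cost of possible repeat alerts.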
