Alerting

How to create alert if no up alert received within 5 minutes?

adam_dixon95
Explorer

Hi,

I am currently using Splunk for SNMP Up/Down traps for interfaces.

We are currently alerting for each Up/Down alert that comes in via a log file and it's getting quite messy, as quite often an Up alert will come in as soon as the Down alert has triggered - creating many false-positives.

I'm looking for a method that would simulate the following:

If a linkDown event is received and a linkUp event for the same device is received within 5 minutes = do not alert.
If a linkDown event is received and no linkUp event is received within 5 minutes = send alert.

Mohanveera1
Explorer

Hello @richgalloway 

I am facing the same scenario and tried the query mentioned above, but I am not getting any results. In my data, host_name represents the devices. The query I used is:

index=th Stage="UP" OR "DOWN" | stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")

But if I try the query below, I get only the devices with DOWN status, and hosts that are currently up are also shown as DOWN:

index=th Stage="DOWN" | stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")


Thanks in Advance....

richgalloway
SplunkTrust

The first query is looking for events where Stage is "UP" or the word "DOWN" is anywhere in the event.  That may be affecting the results.

index=th Stage="UP" OR Stage="DOWN" 
| stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")

The second query looks only for DOWN events so, naturally, it will only display DOWN events.
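One further thing that may be worth checking, offered as a guess rather than a diagnosis: the search command matches field values case-insensitively, but where compares strings case-sensitively. If the events actually contain "Down" or "down" (an assumption, the thread does not show the raw values), the base search would still match them while where lastStatus="DOWN" would discard every row. A sketch that normalizes the case before comparing:

```
index=th (Stage="UP" OR Stage="DOWN")
| stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where upper(lastStatus)="DOWN" AND _time<relative_time(now(),"-30m")
```

If this returns rows where the earlier version did not, the stored values differ in case from the literal used in where.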

---
If this reply helps you, Karma would be appreciated.

Mohanveera1
Explorer

Hello @richgalloway 

Thank you for the response

I have also tried the query below, but I still get no results:

index=th | stats latest(_time) as _time, latest(Stage) as lastStatus by host_name| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")

So, can you please help me create an alert for this scenario?

 

Thanks and regards,
Mohan

richgalloway
SplunkTrust

Try this alternative query.

index=th (Stage="UP" OR Stage="DOWN")
| dedup host_name
| where (Stage="DOWN" AND _time<relative_time(now(),"-30m"))

If you still do not get the desired results, then run the query one command at a time until the results become undesirable.  The last command you added is the one we would need to focus on.
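As one way to apply that step-by-step approach to the stats-based query from earlier in the thread, the sketch below stops before the where clause and adds an age column, so it is visible which half of the condition fails for each host. The field name minutesSinceLastEvent is made up for illustration; index, Stage, and host_name are taken from the queries above.

```
index=th (Stage="UP" OR Stage="DOWN")
| stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| eval minutesSinceLastEvent=round((now()-_time)/60, 1)
| sort - minutesSinceLastEvent
```

A host should pass the final where only if lastStatus is exactly "DOWN" and minutesSinceLastEvent is greater than 30. If no row meets both conditions, an empty result from the full query is expected rather than a sign of a faulty search. It may also help to confirm that the search time range reaches back further than 30 minutes.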

richgalloway
SplunkTrust

Search for both linkDown and linkUp events. If the most recent event is linkDown and it was at least 5 minutes ago, trigger an alert.

<search> | stats latest(_time) as _time, latest(status) as lastStatus by interface 
| where lastStatus="linkDown" AND _time<relative_time(now(), "-5m")
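For the original SNMP trap question, a filled-in version might look like the sketch below. Everything in the base search is an assumption, since the thread does not show the real data: the index name snmp_traps is hypothetical, and the fields status, host, and interface are assumed to be extracted already. Grouping by both host and interface keeps interfaces with the same name on different devices separate.

```
index=snmp_traps (status="linkDown" OR status="linkUp") earliest=-60m
| stats latest(_time) as _time, latest(status) as lastStatus by host, interface
| where lastStatus="linkDown" AND _time<relative_time(now(), "-5m")
```

Saved as a scheduled alert that runs every few minutes and triggers when the number of results is greater than zero, this should stay quiet for a link that recovers within 5 minutes, because its latest event is then linkUp. Note that an interface which stays down keeps matching on every run until it falls outside the 60-minute window, so alert throttling is probably worth enabling.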