Alerting

How to create alert if no up alert received within 5 minutes?

adam_dixon95
Explorer

Hi,

I am currently using Splunk for SNMP Up/Down traps for interfaces.

We currently alert on every Up/Down event that comes in via a log file, and it's getting quite messy: an Up event often arrives right after the Down alert has triggered, creating many false positives.

I'm looking for a method that would implement the following:

If a linkDown event is received and a linkUp for the same device arrives within 5 minutes = do not alert
If a linkDown event is received and no linkUp event is received within 5 minutes = send alert.


Mohanveera1
Explorer

Hello @richgalloway 

I am facing the same scenario and tried the suggested query, but I am not getting any results. In my data, host_name represents the devices. The query I used is:

index=th Stage="UP" OR "DOWN" | stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")

But if I try the query below, I get only devices with DOWN status, and hosts that are actually up are also displayed as DOWN:

index=th Stage="DOWN" | stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")


Thanks in advance.


richgalloway
SplunkTrust

The first query matches events where Stage is "UP" or where the word "DOWN" appears anywhere in the event, because the bare "DOWN" after OR is not tied to the Stage field. That may be affecting the results. Name the field in both terms:

index=th Stage="UP" OR Stage="DOWN" 
| stats latest(_time) as _time, latest(Stage) as lastStatus by host_name
| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")

The second query searches only for DOWN events, so it never sees a later UP event for the same host and, naturally, it will only display DOWN events.

---
If this reply helps you, Karma would be appreciated.

Mohanveera1
Explorer

Hello @richgalloway 

Thank you for the response

I also tried the query below, but I still can't get any results:

index=th | stats latest(_time) as _time, latest(Stage) as lastStatus by host_name| where lastStatus="DOWN" AND _time<relative_time(now(),"-30m")

Could you please help me create an alert for this scenario?

 

Thanks and regards,
Mohan


richgalloway
SplunkTrust

Try this alternative query.

index=th (Stage="UP" OR Stage="DOWN")
| dedup host_name
| where (Stage="DOWN" AND _time<relative_time(now(),"-30m"))

If you still do not get the desired results, run the query one command at a time until the results become undesirable. The last command you added is the one we would need to focus on.


richgalloway
SplunkTrust

Search for both linkDown and linkUp events. If the most recent event is linkDown and it was at least 5 minutes ago, trigger an alert.

<search> | stats latest(_time) as _time, latest(status) as lastStatus by interface 
| where lastStatus="linkDown" AND _time<relative_time(now(), "-5m")
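
To make the logic of that search easier to check outside Splunk, here is a rough Python sketch of the same idea: keep only the most recent event per interface, then flag the interfaces whose latest event is a linkDown older than the grace period. The function name, the tuple layout of the events, and the sample interface names are assumptions made for illustration; they are not part of Splunk or of the original data.

```python
from datetime import datetime, timedelta


def interfaces_to_alert(events, now, grace=timedelta(minutes=5)):
    """Return interfaces whose latest event is a linkDown older than `grace`.

    `events` is an iterable of (timestamp, interface, status) tuples.
    This layout is hypothetical and only mirrors the fields used in the search.
    """
    latest = {}
    for ts, interface, status in events:
        # Equivalent of latest(_time) / latest(status) grouped by interface.
        if interface not in latest or ts > latest[interface][0]:
            latest[interface] = (ts, status)

    # Equivalent of: where lastStatus="linkDown" AND _time < now - 5m
    return sorted(
        interface
        for interface, (ts, status) in latest.items()
        if status == "linkDown" and ts < now - grace
    )


now = datetime(2024, 1, 1, 12, 0, 0)
events = [
    (now - timedelta(minutes=10), "eth0", "linkDown"),
    (now - timedelta(minutes=9), "eth0", "linkUp"),    # recovered quickly: no alert
    (now - timedelta(minutes=8), "eth1", "linkDown"),  # still down after 5 minutes
    (now - timedelta(minutes=2), "eth2", "linkDown"),  # too recent to alert yet
]
print(interfaces_to_alert(events, now))
```

Only eth1 is reported here: eth0 flapped and came back up, and eth2 has not been down long enough yet.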