Down Server Interface Alert

Timmac
New Member

Hey guys,
I'm trying to set up an alert that sends an email when an interface goes down and does not come back up within a certain timeframe; I'm assuming 10-15 minutes should suffice. When we run updates or reboot a server, the interface sometimes does not come up properly.

I'm pretty bad with search logic, so I could really use some help. Thanks much! Below are the logs from a test down state on one of the interfaces, found by searching for USPK10OLLBS01 and /Common/tcp. They report a couple of ups, but only one down.

2/3/15
10:29:00.000 AM
Feb 3 10:29:00 10.10.0.19 Feb 3 10:29:06 uspk10ollbs01 notice mcpd[6642]: 01070727:5: Pool /Common/UAT-BTS-Batch member /Common/USPK10OLBTSBA02:80 monitor status up. [ /Common/tcp: up ] [ was node down for 0hr:0min:3sec ]
host = 10.10.0.19 source = udp:514 sourcetype = syslog
2/3/15
10:28:57.000 AM
Feb 3 10:28:57 10.10.0.19 Feb 3 10:29:03 uspk10ollbs01 notice mcpd[6642]: 01070638:5: Pool /Common/UAT-BTS-Batch member /Common/USPK10OLBTSBA02:80 monitor status node down. [ /Common/tcp: up ] [ was down for 0hr:0min:16sec ]
host = 10.10.0.19 source = udp:514 sourcetype = syslog
2/3/15
10:28:41.000 AM
Feb 3 10:28:41 10.10.0.19 Feb 3 10:28:47 uspk10ollbs01 notice mcpd[6642]: 01070638:5: Pool /Common/UAT-BTS-Batch member /Common/USPK10OLBTSBA02:80 monitor status down. [ /Common/tcp: down ] [ was up for 856hrs:4mins:2sec ]

somesoni2
Revered Legend

Try something like this (assuming the host name is NOT extracted; if it is already extracted, remove the HostName capture group at the start of the regex):

your base search | rex "(?<HostName>\w+)\snotice.*was (?:node )?down for (?<hour>\d+)hrs?\:(?<minute>\d+)mins?\:(?<second>\d+)sec\s*\]" | eval Downtime=round((hour*3600 + minute*60 + second)/60,2) | where Downtime>15

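You can schedule this search and set up an alert on it. Note that in your sample logs the "was down for" text only appears on the event logged when the member leaves the down state, so this search fires once a member has recovered after more than 15 minutes of downtime, not while it is still down.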
http://docs.splunk.com/Documentation/Splunk/6.2.1/Alert/Setupalertactions#Configure_email_notificati...
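To sanity-check the extraction outside Splunk, here is a minimal Python sketch that runs a pattern of the same shape against the message text from the sample events in the question. It assumes the duration can appear as either "0hr:0min:16sec" or "856hrs:4mins:2sec" and that the phrase can read "was down for" or "was node down for", as in those events. The function name downtime_minutes and the shortened sample strings are only for illustration.

```python
import re

# Python spells named groups (?P<name>...); Splunk's rex uses (?<name>...).
# "hrs?" and "mins?" accept both the singular and plural forms seen in the logs.
DOWN_FOR = re.compile(
    r"(?P<HostName>\w+)\snotice.*was (?:node )?down for "
    r"(?P<hour>\d+)hrs?:(?P<minute>\d+)mins?:(?P<second>\d+)sec\s*\]"
)


def downtime_minutes(line):
    """Return the reported downtime in minutes, or None if the line has none."""
    match = DOWN_FOR.search(line)
    if match is None:
        return None
    seconds = (
        int(match["hour"]) * 3600
        + int(match["minute"]) * 60
        + int(match["second"])
    )
    return round(seconds / 60, 2)


# Message text taken from the sample events in the question (pool names shortened out).
samples = [
    "uspk10ollbs01 notice mcpd[6642]: 01070727:5: monitor status up. "
    "[ /Common/tcp: up ] [ was node down for 0hr:0min:3sec ]",
    "uspk10ollbs01 notice mcpd[6642]: 01070638:5: monitor status node down. "
    "[ /Common/tcp: up ] [ was down for 0hr:0min:16sec ]",
    "uspk10ollbs01 notice mcpd[6642]: 01070638:5: monitor status down. "
    "[ /Common/tcp: down ] [ was up for 856hrs:4mins:2sec ]",
]

for line in samples:
    minutes = downtime_minutes(line)
    # Mirrors the "where Downtime>15" step of the search.
    print(minutes, minutes is not None and minutes > 15)
```

None of the three sample events crosses the 15-minute threshold, which matches the short test outage described in the question; the "was up for" event is skipped because it reports uptime rather than downtime.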
