
Splunk alert for server downtime exceeding 15 minutes

I have a list of Windows servers in an inputlookup. If the downtime of any of these servers exceeds 15 minutes, an alert needs to be triggered.

earliest=-15m@m latest=now index=os**** [inputlookup ServerList_*** | where Environment="PROD" OR Environment="DR" | stats count by Host| return 100 host=Host] sourcetype="Perfmon:System" counter="System Up Time"

Can someone please help me to complete this query?


Re: Splunk alert for server downtime exceeding 15 minutes

Hello @saravanafd,

If I understand correctly, you have a list of hosts (via inputlookup) and you want to find those hosts which have not sent any uptime events in the last 15 minutes. Correct?

Try using a subsearch like this:

| inputlookup Server_List_*** | where Environment="PROD" OR Environment="DR" | rename Host as host | table host
| search NOT [search index=... earliest=-15m latest=now sourcetype="Perfmon:System" counter="System Up Time" | dedup host | table host]
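The subsearch above only tells you which hosts are missing. If you also want to see when each host last reported, something along these lines might work. This is only a sketch under a few assumptions: the lookup column is named Host (as in your original query), host names may differ in case between the lookup and the events (hence the lower()), the 60-minute lookback is an arbitrary choice, and last_seen and mins_since are just names I made up for illustration.

```
| inputlookup Server_List_***
| where Environment="PROD" OR Environment="DR"
| rename Host as host
| eval host=lower(host)
| join type=left host
    [ search index=... earliest=-60m latest=now sourcetype="Perfmon:System" counter="System Up Time"
      | eval host=lower(host)
      | stats latest(_time) as last_seen by host ]
| eval mins_since=round((now() - last_seen) / 60, 1)
| where isnull(last_seen) OR mins_since > 15
| fieldformat last_seen=strftime(last_seen, "%c")
| table host Environment last_seen mins_since
```

Hosts with an empty last_seen sent nothing at all in the lookback window. Saved as an alert, the trigger condition would be "number of results is greater than 0".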

Re: Splunk alert for server downtime exceeding 15 minutes

If any host present in the inputlookup has not been pinging (is down) for more than 15 minutes, an alert needs to be sent. I have written the query below. Please correct me if I am wrong.

| metadata type=hosts index=***
| lookup ServerList*** Host as host OUTPUT Environment
| search Environment="PROD" OR Environment="DR"
| eval minsAgo = (now()-lastTime)/60
| search minsAgo>15
| eval time=now()
| rename lastTime as "Last Event"
| fieldformat "Last Event"=strftime('Last Event', "%c")
| fieldformat time=strftime(time, "%c")
| fields + host,Environment,"Last Event",minsAgo,time
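One caveat with metadata is that lastTime covers every sourcetype in the index, so a host that still sends other data would look alive even if its Perfmon feed has stopped. A possible variant that narrows the check to one sourcetype with tstats is sketched below. It assumes the same lookup and field names as the query above, and it can only filter on indexed fields such as sourcetype, not on the search-time counter field. The search time range has to be wider than 15 minutes, and a host with no events at all in that range will not appear in the results.

```
| tstats max(_time) as lastTime where index=*** sourcetype="Perfmon:System" by host
| lookup ServerList*** Host as host OUTPUT Environment
| search Environment="PROD" OR Environment="DR"
| eval minsAgo=round((now() - lastTime) / 60, 1)
| where minsAgo > 15
| eval time=now()
| fieldformat lastTime=strftime(lastTime, "%c")
| fieldformat time=strftime(time, "%c")
| table host Environment lastTime minsAgo time
```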

Re: Splunk alert for server downtime exceeding 15 minutes

I have a list of the hosts to monitor in the inputlookup, and I need to send an alert for the following requirement:

A server is not reachable via SNMP/ping for more than 15 minutes (900 seconds).