I have a list of Windows servers in an inputlookup. If the downtime of any of these servers exceeds 15 minutes, an alert needs to be triggered. This is what I have so far:
earliest=-15m@m latest=now index=os**** [inputlookup ServerList_*** | where Environment="PROD" OR Environment="DR" | stats count by Host| return 100 host=Host] sourcetype="Perfmon:System" counter="System Up Time"
Can someone please help me to complete this query?
If I understand correctly, you have a list of hosts (via inputlookup) and you want to find those hosts which have not sent any uptime events in the last 15 minutes. Correct?
Try using a subsearch like this:
| inputlookup ServerList_*** | where Environment="PROD" OR Environment="DR" | rename Host as host | table host | search NOT [search index=... earliest=-15m latest=now sourcetype="Perfmon:System" counter="System Up Time" | dedup host | table host]
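If the subsearch result limit becomes a concern, the same "in the lookup but not reporting" check can be sketched without NOT [...]. This is only an untested sketch: it assumes the lookup column is named Host (as in the original query), keeps the masked lookup and index names as placeholders, and lowercases host on both sides on the assumption that the lookup and the events may differ in case.

```
index=... earliest=-15m latest=now sourcetype="Perfmon:System" counter="System Up Time"
| eval host=lower(host)
| stats count by host
| append
    [| inputlookup ServerList_***
     | where Environment="PROD" OR Environment="DR"
     | eval host=lower(Host), count=0
     | fields host count]
| stats sum(count) as events by host
| where events=0
```

Each host in the lookup contributes a row with count=0, so after the second stats any host with events=0 sent no "System Up Time" events in the last 15 minutes. Hosts that report but are not in the filtered lookup will never have events=0, so they are dropped by the final where. The alert can then trigger when the number of results is greater than zero.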
If any of the hosts present in the inputlookup is down or not responding to ping for more than 15 minutes, an alert needs to be sent. I have written the query below. Please correct me if I am wrong.
| metadata type=hosts index=*** | lookup ServerList*** Host as host OUTPUT Environment | search Environment="PROD" OR Environment="DR" | eval minsAgo = (now()-lastTime)/60 | search minsAgo>15 | rename totalCount as Count firstTime as "First Event" lastTime as "Last Event" recentTime as "Last Update" | fieldformat Count=tostring(Count, "commas") | fieldformat "First Event"=strftime('First Event', "%c") | fieldformat "Last Event"=strftime('Last Event', "%c") | fieldformat "Last Update"=strftime('Last Update', "%c") | eval time=now() | fieldformat time=strftime(time, "%c") | fields + host,Environment,"Last Event",minsAgo,time
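One thing to watch with the metadata approach: metadata only returns hosts that have indexed at least one event in that index, so a server in the lookup that has never reported will not appear at all. A possible variant, offered only as an untested sketch, starts from the lookup and left-joins the metadata so that such hosts are flagged too. It assumes the lookup column is Host, keeps the masked lookup and index names as placeholders, and assumes host names match exactly (join is case-sensitive).

```
| inputlookup ServerList***
| where Environment="PROD" OR Environment="DR"
| rename Host as host
| join type=left host
    [| metadata type=hosts index=***
     | fields host lastTime]
| eval minsAgo = round((now() - lastTime) / 60, 1)
| where isnull(lastTime) OR minsAgo > 15
| eval "Last Event" = if(isnull(lastTime), "never", strftime(lastTime, "%c"))
| table host Environment "Last Event" minsAgo
```

Rows with "Last Event" set to "never" are lookup hosts with no events in the index. All other rows are hosts whose latest event is more than 15 minutes old.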
I have the list of hosts to monitor in the inputlookup and need to send an alert for the following requirement:
A server is not reachable via SNMP/ping for more than 15 minutes (900 seconds).