I have the Splunk Add-on for Unix and Linux installed, which collects the output of the "ps" command. How can I monitor the tomcat service (or any other service) on multiple hosts over a 5-minute window?
This is what I have so far:
host="server-*" source="ps" tomcat
I would like to trigger an alarm whenever the tomcat service has been down for more than 5 minutes on any of the hosts that the query finds.
Try the following search:
host="server-*" source="ps" process_name="tomcat" | dedup host | eval lastseen=now()-_time
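Here, dedup host keeps only the most recent matching event for each host, so lastseen is the number of seconds since the process last appeared in the ps output on that host. You might need to change process_name="tomcat" to suit your needs. You might also want to add "index=..." to the search; restricting it to a specific index makes it faster.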
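Save this search as an alert with the custom trigger condition lastseen>300 (300 seconds = 5 minutes). The time range should be several hours, e.g. the last 24 hours, so that a host whose process stopped some time ago still returns its last event.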
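To make the logic of that search easier to follow, here is a minimal Python sketch of what it computes. This is only an illustration, not Splunk code: the function name stale_hosts, the sample host names, and the timestamps are all made up, and the real search works on ps events rather than tuples.

```python
# Hypothetical stand-in for ps events matching process_name="tomcat":
# each tuple is (host, epoch time of the event). Not real Splunk output.
def stale_hosts(events, now, threshold=300):
    """Return {host: lastseen} for hosts whose newest event is older than threshold seconds."""
    newest = {}
    for host, ts in events:
        # Same effect as "dedup host" on newest-first results: keep the latest event per host.
        if host not in newest or ts > newest[host]:
            newest[host] = ts
    # Same idea as eval lastseen=now()-_time plus the lastseen>300 trigger condition.
    return {h: now - ts for h, ts in newest.items() if now - ts > threshold}


events = [("server-1", 1000), ("server-1", 1290), ("server-2", 900), ("server-2", 600)]
print(stale_hosts(events, now=1300))  # {'server-2': 400}
```

Note that a host with no matching event at all inside the time range never shows up in the result, which is why the alert needs a time range much longer than 5 minutes.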
When you say "You also might want to add "index=..."", does that mean setting up a monitor like this:
splunk add monitor /opt/tomcat/logs/catalina.out -index tomcat
Right?
And then running this search:
host="server-*" source="tomcat" process_name="tomcat" | dedup host | eval lastseen=now()-_time