Hi,
We need to create an alert that checks whether Tomcat is up and running. We can identify this using the pid.
If Tomcat is up and running, we receive Tomcat events (logs) that include the pid.
If Tomcat is down, we won't receive any events.
Can you please help us with the search for this?
Base query below, but this does not work because whenever Tomcat is down, we won't get any events. We may need to use something like fillnull?
index=idne1 source=ps host=host1 OR host=host2* apache-tomcat
|stats count(pid) as Count by host _time
|where Count <1
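On the fillnull idea: fillnull only fills fields on rows that already exist, so it cannot help when a host returns no rows at all. One common workaround is to append a zero-count row for each host you expect and then sum the counts. A minimal sketch, assuming a 5-minute window and a fixed host list ("host2-example" is a made-up placeholder for whatever host2* actually matches):

```spl
index=idne1 source=ps (host=host1 OR host=host2*) apache-tomcat pid="*" earliest=-5m
| stats count as Count by host
| append
    [| makeresults
     | eval host="host1,host2-example"
     | makemv delim="," host
     | mvexpand host
     | eval Count=0
     | fields host Count]
| stats sum(Count) as Count by host
| where Count < 1
```

Any host left in the result had no matching events in the window. The host list has to be maintained by hand (or moved into a lookup), because a wildcard such as host2* cannot produce a row for a host that sent nothing.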
Try this
index=idne1 source=ps host=host1 OR host=host2* apache-tomcat pid="*" earliest=-10m | stats latest(_time) as latest_event | where latest_event<relative_time(now(), "-5m@m")
If you want to do this sort of thing in Splunk, you might be better off trying something like this:
| tstats latest(_time) as lastTime by host
| search host=host1 OR host=host2*
| eval age=now()-lastTime
| search age > 120
| sort age d
| convert ctime(lastTime)
| fields age,host,lastTime
| sort host
This will tell you if those hosts have sent no events to any index within the past 120 seconds.
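Note that tstats only works on indexed fields, so this checks whether the host is sending anything at all, not whether a Tomcat process is listed. If you want to at least restrict it to the ps data, a variation along these lines may work. This is a sketch: index=idne1 and source=ps come from the original query, and max(_time) is swapped in for latest(_time).

```spl
| tstats max(_time) as lastTime where index=idne1 source=ps (host=host1 OR host=host2*) by host
| eval age=now()-lastTime
| where age > 120
| convert ctime(lastTime)
| fields host, lastTime, age
```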
If I assume that every running Tomcat host will have at least 1 log every 5 minutes and that you need to be alerted within an hour if anything goes down, then try running this every hour over the last 24 hours:
index=idne1 source=ps host=host1 OR host=host2* apache-tomcat pid="*"
| dedup host
| where (now() - _time) > (5*60)
We are actually taking these from source=ps, where we get one log every 30 seconds.
So just change it to fit.
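For a 30-second ps interval, something like this might fit. It is only a sketch: the 90-second threshold (three missed samples) and the 15-minute lookback are assumptions, so adjust them to your alert schedule.

```spl
index=idne1 source=ps (host=host1 OR host=host2*) apache-tomcat pid="*" earliest=-15m
| stats latest(_time) as lastSeen by host
| eval age=now()-lastSeen
| where age > 90
| convert ctime(lastSeen)
```

A host that has been silent for the whole lookback produces no row, so keep the lookback longer than the longest outage you want to catch.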