We have multiple aa-dev-server hosts running JBoss, and the query below sends me an alert when the JBoss service is down. The issue is that my Splunk account is limited to only a few real-time alerts. Is there a query that uses a wildcard ("*") to pick up all hosts matching host="aa-dev-jboss-*", but sends an email specifying which JBoss host went down, or provides a table showing which servers did?
host="aa-dev-jboss-1" source=ps jboss | stats latest(_time) as latest by host
@shakeel253, you need not run a real-time alert; a scheduled alert can be based on the SLA defined at the Enterprise level.
For example, you can run the alert every 5 minutes over the last 15 minutes to check host ping status. Since you seem to base your query on host, you can use the metadata command to write a faster search for the same. Also, with the query above, if a host has no events at all for the time period you are searching, that host will not be reported. So you would need a lookup file in Splunk with all available host names. You can have a static lookup file, or a scheduled Splunk search with the outputlookup command that writes the available hosts to the lookup file. PS: You can also use a Splunk REST service call to get a list of all hosts which are pinging your Splunk server. Assuming your host lookup file is available_jboss_hosts.csv with the field name host, you can try a query like the following:
| tstats latest(_time) as _time WHERE index=<yourIndexName> BY host | eval "downTime (in Min)"=round((now()-_time)/60,0) | appendpipe [ | inputlookup available_jboss_hosts.csv | fields host | eval "downTime (in Min)"="999" ] | dedup host | where 'downTime (in Min)'>5
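As a rough sketch of the scheduled search that could keep the lookup file populated: this assumes the JBoss hosts follow the aa-dev-jboss-* naming pattern from the question and that the time range (for example, the last 7 days) is set on the saved search itself. Both are assumptions, so adjust the index, host filter, and schedule to your setup.

```spl
| tstats count WHERE index=<yourIndexName> host="aa-dev-jboss-*" BY host
| fields host
| outputlookup available_jboss_hosts.csv
```

Note that each run overwrites the lookup, so a host that stays silent for longer than the search window would drop out of the list. A static lookup file avoids that.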
Hey Niketniley, it looks like the query will do the job, but I am getting the error mentioned below. I tried putting a space between search "downTime (in Min)">5 but it's not helping.
Comparator '>' is missing a term on the right hand side
The search job has failed due to an error. You may be able view the job in the Job Inspector.
@shakeel253, my bad, I had used search in place of where. I have replaced the double quotes with single quotes as well. Please try again and confirm.
@shakeel253, I based it off the query you mentioned in the question, which you said works fine for an individual JBoss server -> "below query sends me alert when jboss service is down"
Nevertheless, which OS is the JBoss server running on, Windows or Linux? Usually on Windows, JBoss service start and stop are logged in Event Viewer; is that so? If not, do you have explicit JBoss logs that can be used instead?
If you can provide the logs or the place you use to identify that the JBoss service is down, the same can be plugged into the alert.
First, let me change my answer to a comment, because it seems the query in your question does not do what you expect.
The OS we are using is Linux (Amazon Linux EC2 instances or Red Hat). We use the JBoss server log to identify whether it is starting or shutting down; the absolute path is /opt/jboss-eap/standalone/log/server.log.
Another way we check whether the JBoss service is running is by checking the JBoss PID:
ps aux | grep jboss
Then instead of WHERE
sourcetype for JBOSS if you have kept one. Otherwise use
I hit two issues when I changed the source and ran the query:
1) It is picking up JBoss servers from other environments. For example, I need the JBoss servers for the ABC environment but not the DEF environment, yet it is picking up all servers from both ABC and DEF.
2) The query should only return results when it detects that a JBoss server is down, but it is still showing results.