I'm trying to create a statistics table for whether or not a given Linux service is running on a set of hosts. For example, for service "rhnsd" running on hosts "my-*" ...
Host | State |
my-db-1 | Running |
my-db-2 | Stopped |
my-web-1 | Running |
my-web-2 | Stopped |
I have the ps module enabled, so I can use that as a source/sourcetype, but I'm not sure how to elegantly display all hosts and the state of the given service as illustrated above. Any help is greatly appreciated.
Hi @bsg273,
if the field containing the service name is extracted and called "service" (if you need help with field extraction, you can ask the community, sharing a sample of your logs), you have to create a lookup (called e.g. services.csv) containing the services to monitor, optionally associated with a host, something like this:
host,service
host1,rhnsd
host1,service1
host2,rhnsd
host2,service2
then you have to run a search like the following:
index=os
| eval host=lower(host), service=lower(service)
| stats count BY host service
| append [ | inputlookup services.csv | eval host=lower(host), service=lower(service), count=0 | fields host service count ]
| stats sum(count) AS total BY host service
| rename host AS Host service AS Service
| eval Status=if(total=0,"Not Present","Present")
| table Host Service Status
Ciao.
Giuseppe
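Adapted to the original question (service "rhnsd" on hosts "my-*"), the same pattern might look like the sketch below. It is only an illustration under several assumptions: the ps events land in index=os with sourcetype=ps, the process name is extracted into a field called "service" as described above, services.csv lists every host/service pair you expect to see, and a 15-minute window is long enough to cover at least one ps collection. Adjust those names and the time range to your environment.

```
index=os sourcetype=ps host=my-* earliest=-15m
| eval host=lower(host), service=lower(service)
| search service="rhnsd"
| stats count BY host service
| append
    [| inputlookup services.csv
     | eval host=lower(host), service=lower(service), count=0
     | search host="my-*" service="rhnsd"
     | fields host service count ]
| stats sum(count) AS total BY host service
| eval State=if(total=0,"Stopped","Running")
| rename host AS Host
| table Host State
```

The lookup rows contribute a count of 0 for every expected host, so a host with no matching ps events in the window still appears in the table and is labelled "Stopped".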
I'm not seeing where this uses ps or anything else to see if a given service/process is running. Could you please advise?
Hi @bsg273,
as @richgalloway said, this is a search on a Splunk index where Linux logs are stored, but you have to run the ps command on the machines and send its output to Splunk.
To do this, you have to install a Universal Forwarder on the target servers and deploy the Splunk Add-On for Unix and Linux (https://splunkbase.splunk.com/app/833/) to them.
This way, you'll have the results of the ps command in an index (usually os), ready to use in a search like the one I suggested.
Remember to enable the inputs that you need (in your case, ps) and set the index option to the index you want to use.
Ciao.
Giuseppe
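As a rough illustration of enabling the ps input and setting its index, a local override in the add-on might look like the fragment below. The stanza name and values are assumptions based on the add-on's usual defaults; check default/inputs.conf in your installed version before copying them.

```
# local/inputs.conf inside the Splunk Add-on for Unix and Linux on the forwarder
# Assumption: the ps scripted input is defined under this stanza name in your version.
[script://./bin/ps.sh]
# Scripted inputs in the add-on are typically disabled until you enable them
disabled = 0
# Collection interval in seconds (assumed value; tune to your needs)
interval = 30
sourcetype = ps
source = ps
# Assumption: an index named "os" already exists on the indexers
index = os
```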
ps runs on each server and sends data to Splunk at intervals. It is not run at search-time. The data usually is stored in the 'os' index. Giuseppe's query searches that index.
Finding something that is not there is not Splunk's strong suit. See this blog entry for a good write-up on it.
https://www.duanewaddle.com/proving-a-negative/
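When every host of interest is still sending ps data, one way to sidestep part of this problem is to let the hosts' own events define the host list and flag the hosts that report no matching process. The sketch below is a minimal example and makes several assumptions: the data is in index=os with sourcetype=ps, the process name is in a field called "service", and a 15-minute window covers at least one ps collection.

```
index=os sourcetype=ps host=my-* earliest=-15m
| eval is_rhnsd=if(lower(service)="rhnsd",1,0)
| stats max(is_rhnsd) AS running BY host
| eval State=if(running=1,"Running","Stopped")
| rename host AS Host
| table Host State
```

A host that has stopped forwarding entirely will not appear in this table at all, which is the case where a lookup of expected hosts, as in Giuseppe's search, is still needed.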