Splunk Search

How to write a search to find servers that are not reporting?

Motivator

On patch night, some of my Splunk servers fail to start.
I can see the ones that did start with this search:

host=*mysplunkservers* index=sos sourcetype=ps process="splunkd *-p_8089_*start" | stats count by host process

This does not tell me which ones are not running.

If I had a search that listed all the servers and filled in null for the ones that are not reporting, I could set up an alert and send out a pager notification.

How can I write the search to find the servers that are not reporting?


Splunk Employee

Have you looked at the platform alerts in the Distributed Management Console? There is an alert there for when an indexer is stopped. See "Platform alerts" in the Admin Manual for more information.


Motivator

I'd suggest either maintaining a lookup table with the full list of servers and joining your search results to it, or extending your search window by a few hours or days and replacing stats count ... with stats latest(_time) as last_event_time ..., so you can see when the latest event for each server came in. You can then sort by that time to see which servers have the oldest events (and whether those events predate your patch).
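
As a rough sketch of the lookup approach: this assumes a lookup file named expected_splunk_hosts.csv (a hypothetical name, not something from your environment) with a single host column listing every server you expect to report. The left join keeps every expected host, fillnull puts 0 where no matching event was found, and the final where leaves only the servers that did not report.

```
| inputlookup expected_splunk_hosts.csv
| fields host
| join type=left host
    [ search host=*mysplunkservers* index=sos sourcetype=ps process="splunkd *-p_8089_*start"
      | stats count by host ]
| fillnull value=0 count
| where count=0
```

If you save this as an alert, triggering when the number of results is greater than zero should cover the pager case. The host values in the lookup have to match the indexed host values exactly for the join to line up.

The second approach could look something like the following. The 24 hour window is only an example value; pick one that reaches back to before your patch window.

```
host=*mysplunkservers* index=sos sourcetype=ps process="splunkd *-p_8089_*start" earliest=-24h
| stats latest(_time) as last_event_time by host
| sort last_event_time
| eval last_seen=strftime(last_event_time, "%Y-%m-%d %H:%M:%S")
```

Servers at the top of the results have the oldest events. Keep in mind that this version only lists servers that reported at least once inside the search window, so a server that has been down longer than that will not appear at all; the lookup version does not have that limitation.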