Monitoring Splunk

What's the best way to get a list of forwarders where the splunkd service has stopped running?



I need to find all the servers where the splunkd service was running before but is not running now. I have more than 9,000 forwarders, which fall into the three scenarios listed below:

  1. splunkd is not running.
  2. splunkd is running and deployment client is set but indexer configurations are not done.
  3. splunkd is running and indexer configurations are done but deployment client is not set.

Because of these scenarios, I am finding it difficult to use queries based on phone-home events or on internal logs received in Splunk, as they return an incorrect server list.

Also, I'm not allowed to use a script to monitor the splunkd service on each host, as that requires remote login.

Currently I'm using internal logs to identify which forwarders are up and which are down, but I'm looking for a better solution.

Thank You.


You can use this to get the number of seconds since each host last sent internal logs, then set a threshold of 60 seconds or more based on your configuration.

index=_internal | stats max(_time) as lastSeen by host | eval secondsDead=now()-lastSeen | convert ctime(lastSeen) | sort - secondsDead



Adding to @dantimola

You can use the three ideas below and tweak them to your requirements:

  1. If no internal logs are arriving from a UF, schedule an hourly search that reports how long it has been since that forwarder last sent data. You can use the tstats command to reduce search processing.
  2. Check Splunk's internal logs and correlate them with TCPOutput events to see whether forwarding is failing.
  3. Correlate Splunk's internal logs with the phone-home connections made to the deployment server (DS). A UF should phone home to the DS at a regular interval, including after the DS is restarted (the default phoneHomeIntervalInSecs in deploymentclient.conf is 60 seconds).

Hopefully you also have an asset database, which would make it easier to correlate the results and reach out to the server admins.
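To make the correlation idea concrete, here is a minimal sketch of how the two signals could be combined against an asset list so that the three scenarios in the question are told apart. It assumes you have already exported, per host, the epoch time of the last internal log and of the last DS phone home (for example from searches like the ones in this thread). The function name, the status labels, and the 15-minute threshold are all hypothetical and only there for illustration.

```python
from typing import Dict, Iterable, Optional

# Assumed threshold: a signal older than this many seconds counts as stale.
STALE_AFTER_SECS = 15 * 60


def classify_forwarders(
    assets: Iterable[str],
    last_internal_log: Dict[str, float],
    last_phone_home: Dict[str, float],
    now: float,
    stale_after: float = STALE_AFTER_SECS,
) -> Dict[str, str]:
    """Label every host in the asset list using both signals.

    A host missing from a dict is treated as never having sent that signal.
    """

    def is_fresh(ts: Optional[float]) -> bool:
        return ts is not None and (now - ts) <= stale_after

    statuses: Dict[str, str] = {}
    for host in assets:
        logs_ok = is_fresh(last_internal_log.get(host))
        phone_ok = is_fresh(last_phone_home.get(host))
        if logs_ok and phone_ok:
            statuses[host] = "healthy"
        elif phone_ok:
            # Phones home but sends no internal logs: indexer config likely missing.
            statuses[host] = "no_outputs"
        elif logs_ok:
            # Sends internal logs but never phones home: deployment client not set.
            statuses[host] = "no_deployment_client"
        else:
            # Neither signal is recent: splunkd is probably not running.
            statuses[host] = "suspect_down"
    return statuses
```

Only the hosts labelled suspect_down would need a follow-up with the server admins. Driving the loop from the asset list rather than from search results is what catches hosts that have never sent anything to Splunk.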



Have you tried checking via the deployment server? You can check the status of your universal forwarders under Forwarder Management on the deployment server. Have you also tried the query below?

| metadata type=hosts index=<index name> | convert ctime(*Time)

