I would like to monitor the status of a Linux-based splunkd configured as a heavy forwarder from an external system (Nagios, custom scripts, etc.). The only visibility I'm aware of into the state of the daemon from the Linux CLI is 'splunk status splunkd', which just tells me whether the daemon is running or not. Is there any way for splunkd to report, for instance, how long it's been running and the number of connections accepted in the last few minutes?
There are multiple ways you can do this:
1. Look at the logs. We use "metrics.log", specifically the "tcpin_queue" entries, to measure data transmission. The log has several other components you can also use. It is updated frequently, so have Nagios check it every 10-15 minutes using whatever logic suits you; if there has been no update in that window, there is a problem.
2. REST API calls. You need a user in Splunk with API access for this. Fire the REST call from your monitoring system at whichever endpoint you need, in particular the introspection endpoints, which expose most of the information needed for monitoring. Just go to the level of detail you require (e.g. curl -k -u admin:changeme https://localhost:8089/services/server/status/resource-usage/splunk-processes).
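Both options can be sketched as a small Nagios-style check in Python. This is only a rough illustration, not an official Splunk plugin. The function names, the 900-second threshold, and the exit-code constants are my own choices. The log path you pass in is an assumption too (on a default install it is usually $SPLUNK_HOME/var/log/splunk/metrics.log). The REST helper hits the same endpoint as the curl example and skips certificate verification the way "curl -k" does.

```python
import base64
import json
import os
import ssl
import time
import urllib.request

# Standard Nagios plugin exit codes.
NAGIOS_OK = 0
NAGIOS_CRITICAL = 2


def check_metrics_log(path, max_age_seconds=900):
    """Option 1: flag a problem if metrics.log has not been written recently.

    Returns a (nagios_exit_code, message) tuple. The 900 s default mirrors
    the 10-15 minute window suggested above and is only an assumption.
    """
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError as exc:
        return NAGIOS_CRITICAL, f"CRITICAL - cannot stat {path}: {exc}"
    if age > max_age_seconds:
        return NAGIOS_CRITICAL, f"CRITICAL - {path} not updated for {age:.0f}s"
    return NAGIOS_OK, f"OK - {path} updated {age:.0f}s ago"


def fetch_splunk_processes(base_url, user, password, timeout=10):
    """Option 2: query the introspection endpoint from the curl example.

    Certificate checks are disabled to match 'curl -k'; do not do this
    against a host you do not trust.
    """
    url = (base_url.rstrip("/")
           + "/services/server/status/resource-usage/splunk-processes"
           + "?output_mode=json")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": "Basic " + token})
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout,
                                context=context) as response:
        return json.load(response)
```

To use it from Nagios, call check_metrics_log("/opt/splunk/var/log/splunk/metrics.log") (adjust the path to your install), print the message, and exit with the returned code. fetch_splunk_processes("https://localhost:8089", "admin", "changeme") returns the parsed JSON, which you can then inspect for whatever fields you care about.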