I have the following requirement and would appreciate any ideas.
I need to trigger an alert if Splunk is not restarted within 5 minutes of being stopped.
That is, if someone stops the service and forgets to start it again within 5 minutes, I want an alert to fire.
I can find the stop and start events both in the _audit index and in splunkd.log. Which one is more reliable, and how can I calculate the gap between them and trigger an alert?
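For the retrospective part (measuring the gap once Splunk is back up), a search over splunkd.log can pair each start event with the preceding stop and compute the difference. This is only a sketch: the message strings ("Shutting down splunkd", "Splunkd starting") are assumptions and may vary between Splunk versions, so check your own splunkd.log for the exact wording. Also note that a search like this can only fire after Splunk has restarted, never while it is down.

```
index=_internal source=*splunkd.log* ("Shutting down splunkd" OR "Splunkd starting")
| eval action=if(searchmatch("Shutting down"), "stop", "start")
| sort 0 _time
| streamstats current=f window=1 last(_time) as prev_time last(action) as prev_action
| where action="start" AND prev_action="stop"
| eval gap_minutes=round((_time - prev_time)/60, 1)
| where gap_minutes > 5
```

You could schedule this as an alert to catch cases where the restart happened but took too long; it will not catch the case where Splunk never comes back up at all.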
Thanks in advance
This is really a broader sysadmin/service-monitoring question. If Splunk is not running, then by definition it cannot create an alert, can it?
You don't say which platform you are running on (Linux or Windows). On Linux, I would suggest using an external service monitor such as Nagios or Zabbix, which continuously checks criteria you define (in this case, a TCP listener on Splunk's management port 8089) and alerts at different severities (e.g. a warning as soon as the service goes down, escalating to critical after 5 or more minutes of unavailability).
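If you can't deploy a full monitoring stack, the same idea can be sketched in a few lines of stdlib Python run from cron or a systemd timer on another host. The host, port, poll interval, and 300-second threshold below are assumptions to adjust for your environment; it simply tests whether the management port accepts TCP connections and escalates once the outage exceeds the threshold.

```python
import socket
import sys
import time

HOST = "localhost"   # assumed: run this from a machine that can reach splunkd
PORT = 8089          # Splunk's default management port
THRESHOLD = 300      # seconds of downtime before escalating (5 minutes)
INTERVAL = 30        # seconds between polls

def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def monitor(host: str = HOST, port: int = PORT,
            threshold: int = THRESHOLD, interval: int = INTERVAL) -> int:
    """Poll the port; warn when it first goes down, return a
    Nagios-style critical exit code (2) once the outage exceeds
    the threshold."""
    down_since = None
    while True:
        if port_open(host, port):
            if down_since is not None:
                print("OK: service recovered")
            down_since = None
        elif down_since is None:
            down_since = time.monotonic()
            print(f"WARNING: port {port} not listening")
        elif time.monotonic() - down_since >= threshold:
            print(f"CRITICAL: port {port} down for over {threshold}s")
            return 2
        time.sleep(interval)

if __name__ == "__main__":
    sys.exit(monitor())
```

In practice you would replace the `print` calls with whatever alerting channel you use (email, PagerDuty, etc.); the key point is that the check lives outside Splunk, so it still works when Splunk is stopped.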