I may have missed this somewhere, but I'm wondering if anyone has a way to detect when splunkd is being shut down by an admin on forwarders.
The concern is that if I have forwarders installed on a Windows/Linux server, an admin with access to that server may still be able to stop the Splunk services, tamper with things, and then restart the services (the deployment monitor only checks whether events are missing within a certain time window).
So is there any way we can detect this? Thanks in advance.
We can write a PowerShell script to monitor the Windows splunkd service.
The script monitors the service and, using an SMTP server, sends an alert email to the people who support Splunk.
Immediate action can then be taken to start the Splunk services again.
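The reply above suggests PowerShell; as a rough illustration of the same idea, here is a minimal sketch in Python instead. It parses the output of the Windows `sc query` command and emails an alert when the service is not running. The service name, SMTP host, and mail addresses are placeholders I made up (a universal forwarder may register under a different service name), so treat them as assumptions.

```python
import re
import smtplib
import subprocess
from email.message import EmailMessage

SERVICE_NAME = "splunkd"               # assumption: adjust to the installed service name
SMTP_HOST = "smtp.example.com"         # hypothetical SMTP relay
ALERT_FROM = "splunk-watch@example.com"  # hypothetical sender
ALERT_TO = "splunk-admins@example.com"   # hypothetical recipients


def parse_sc_state(sc_output: str) -> str:
    """Return the state word (RUNNING, STOPPED, ...) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


def query_service_state(name: str) -> str:
    """Ask the Windows service controller for the current state of a service."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)


def build_alert(host: str, service: str, state: str) -> EmailMessage:
    """Build the alert mail; sending is kept separate so this part is easy to test."""
    msg = EmailMessage()
    msg["Subject"] = f"{service} is {state} on {host}"
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content(f"Service {service} on {host} reported state {state}.")
    return msg


def check_once(host: str) -> str:
    """Check the service once and send a mail if it is not running."""
    state = query_service_state(SERVICE_NAME)
    if state != "RUNNING":
        with smtplib.SMTP(SMTP_HOST) as smtp:
            smtp.send_message(build_alert(host, SERVICE_NAME, state))
    return state
```

Something like `check_once(socket.gethostname())` could be run every minute from the Windows Task Scheduler. Note that an admin who can stop splunkd can usually stop this check as well, so it narrows the gap rather than closing it.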
Some updates. Linux aside, I just upgraded one of our old Splunk forwarders on a Windows server (to universal forwarder 4.2.4). I've noticed the following:
1) With the method of searching the _internal index, if the forwarder is shut down, the indexer does not receive the shutdown event until the forwarder has restarted. (So we probably won't be alerted while it's down?)
2) With monitoring the Windows System event log for forwarder shutdown events, an event is logged to the Windows event log when the service is shut down, but the forwarder does not send this event to the indexer. Even after the forwarder service has been restarted, the period when the forwarder was down is not captured at the indexer. (I'm not sure why, but it seems the older version of Splunk was able to do this.)
I'm also trying to capture a stop command BEFORE Splunk actually stops. The workaround I'm currently using is editing the splunk script in init.d: I write something to a file that the forwarder monitors, just before the line that stops Splunk. It somewhat works, but I'm still looking for a better way of implementing this without having to modify the default Splunk files.
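One way to keep the same idea without touching the init.d script might be a separate wrapper that writes the marker line and then calls the stop command. This is only a sketch: the binary path, the marker file, and the delay are assumptions, the marker file would need its own monitor stanza on the forwarder, and anyone who calls the real splunk binary directly bypasses it.

```python
import getpass
import subprocess
import time
from datetime import datetime, timezone

SPLUNK_BIN = "/opt/splunkforwarder/bin/splunk"  # assumption: adjust to your install path
MARKER_LOG = "/var/log/splunk_stop_marker.log"  # hypothetical file the forwarder monitors
FLUSH_DELAY = 5  # assumption: seconds to give the forwarder a chance to send the marker


def format_marker_line(user: str, when: datetime) -> str:
    """Build the line that announces an upcoming stop."""
    return f"{when.isoformat()} splunk_stop_requested user={user}\n"


def stop_with_marker(path=MARKER_LOG, delay=FLUSH_DELAY, user=None, run=subprocess.run):
    """Append the marker to the monitored file, wait briefly, then stop Splunk."""
    user = user or getpass.getuser()
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(format_marker_line(user, datetime.now(timezone.utc)))
    time.sleep(delay)  # no guarantee the event is forwarded in time; tune as needed
    return run([SPLUNK_BIN, "stop"])
```

Admins would then be asked to call `stop_with_marker()` (for example through a small script on their PATH) instead of running the stop command directly.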
Maybe look at this problem from the sysadmin scope rather than the Splunk admin scope. Read the very end of this post http://www.indigorose.com/forums/archive/index.php/t-30167.html on how to make a service unstoppable.
Don't blame me if you break something 🙂
Just tested that: if you forward your forwarder's _internal index (outputs.conf - forwardedindex) to the indexer, you can see a message like this:
07:25:46.174 AM 08-22-2011 05:25:46.174 +0200 INFO ShutdownHandler - shutting down level ...
if you search for
index=_internal ShutdownHandler on your indexer. This way you would at least know someone stopped the forwarder.
Hi remy06, in this case build a watchdog script on your Linux box that checks whether the process 'splunkd' is running and, if not, restarts 'splunkd', and you're set 😉
Find an example here: http://blog.eracc.com/2010/05/08/linux-monitor-a-service-with-a-watchdog-script/
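For illustration, a watchdog along those lines could look roughly like the Python sketch below (the linked post uses a shell script; this is not that script). It uses `pgrep -x` to look for a process named exactly splunkd and starts Splunk if none is found. The binary path is an assumption, and the `run` parameter exists only so the logic can be exercised without touching a real system.

```python
import subprocess

SPLUNK_BIN = "/opt/splunkforwarder/bin/splunk"  # assumption: adjust to your install path


def is_running(process_name: str = "splunkd", run=subprocess.run) -> bool:
    """True if at least one process with exactly this name exists."""
    # pgrep exits with status 0 when it finds a match, non-zero otherwise.
    result = run(["pgrep", "-x", process_name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def watchdog_pass(run=subprocess.run) -> bool:
    """Do one check; return True if a restart was attempted."""
    if is_running(run=run):
        return False
    run([SPLUNK_BIN, "start"])
    return True
```

Calling `watchdog_pass()` from a cron job every minute would bring splunkd back regardless of how it was stopped, but by itself it does not record who stopped it, so logging each restart somewhere the indexer can see is probably worth adding.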
I've been using your suggestions and they work fine so far, as long as Splunk is shut down using .../splunk start/stop on Linux. However, on Linux a privileged user can also run the "kill" command to stop the service. I'm wondering what other workarounds you have?
I've been trying to add an auditd rule to catch this, but it doesn't work yet.