When I run 'splunk status' on Linux, the exit code is 0 even when Splunk is not running. This makes it hard to use in automation (Chef in my case), which relies on the return code to tell whether the service is up or down. Is there another good way to do this?
If you use,
splunk status splunkd
The return code is 0 when the daemon is running and 3 when it is not. But if you use,
splunk status splunkweb
The return code is always 0, whether or not the web service is running. I think it is a bug.
I think this got fixed in the intervening years, but in any event it is no longer relevant, as splunkweb is no longer a separate service.
Hello, this seems to be a bug. Splunk's exit code is clearly not LSB-compliant, at least on Linux. We experience the same problem.
http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
Yes indeed. This is why splunk status changed to returning 3 when the splunkd service is down.
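For reference, the LSB page linked above lists the exit codes that the status action of an init script should return. A small sketch that turns such a code into its meaning; the function name `lsb_status_text` is made up for illustration and is not part of Splunk or the LSB:

```shell
#!/bin/sh
# Hypothetical helper: translate an LSB "status" exit code into its meaning.
lsb_status_text() {
    case "$1" in
        0) echo "program is running or service is OK" ;;
        1) echo "program is dead and /var/run pid file exists" ;;
        2) echo "program is dead and /var/lock lock file exists" ;;
        3) echo "program is not running" ;;
        4) echo "program or service status is unknown" ;;
        *) echo "reserved code" ;;
    esac
}

# Possible use after a status call:
# "$SPLUNK_HOME/bin/splunk" status splunkd >/dev/null 2>&1; lsb_status_text "$?"
```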
ianformanek,
Building on your answer, would it not be better to be a little more explicit and use:
To check just splunkweb...
... $SPLUNK_HOME/bin/splunk status splunkweb | grep 'is not running' | ...
To check just splunkd (not including helpers)
... $SPLUNK_HOME/bin/splunk status splunkd | grep splunkd | grep 'is not running' | ...
You could also check helpers are running (used for running scripts, etc.)
... $SPLUNK_HOME/bin/splunk status splunkd | grep 'splunk helpers' | grep 'are running'...
Doing this could prove more reliable, as one service could crash without stopping the other (or someone could kill one process but not the other).
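A small sketch of the per-service checks above as a reusable function. The name `component_is_down` is hypothetical, and the sample text in the usage comment only imitates the 'is not running' wording quoted in this thread, not the exact output of any Splunk version. It reads the status text on stdin, so it can be tried without Splunk installed.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if stdin contains a line that mentions the
# given component together with "is not running", non-zero otherwise.
component_is_down() {
    grep -F -- "$1" | grep -q 'is not running'
}

# Example with made-up status text:
# printf 'splunkd is not running.\n' | component_is_down splunkd && echo "splunkd is down"
```

With a real installation the input would come from `$SPLUNK_HOME/bin/splunk status splunkd` instead of `printf`.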
Indeed, depending on the use case, this may be better. In my case, I want to check whether all the services are running to determine whether splunk service start needs to be called (that would only start those not running), hence the check for any occurrence of 'is not running'.
Here is the best approach I ended up with to simulate returning a non-zero exit code when one of the Splunk services is not running:
expr `service splunk status | grep 'is not running' | wc -l` == 0
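The same check can probably be written without `expr` and `wc`, since `grep -q` already reports through its exit code whether a match was found. A minimal sketch; the function name `all_services_running` is made up here, and it reads the status text on stdin so the logic can be exercised without Splunk:

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when no line on stdin says "is not running",
# exit 1 when at least one line does.
all_services_running() {
    ! grep -q 'is not running'
}

# Intended use, assuming the same 'service splunk status' command as above:
# service splunk status | all_services_running || service splunk start
```

Like the `expr` version, this treats empty output as "everything is running", so it is only as reliable as the status text itself.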
I am not sure why the 'splunk status' command does not work in your environment, but another way to monitor the Splunk processes is the "ps" command.
A normal default Splunk will start up:
- Two "splunkd" (or "splunkd.exe") processes. One does indexing, and the other helps launch other processes as necessary.
- SplunkWeb, which runs inside "python" (or "pythonservice.exe").
So, you can use the following command to monitor those processes.
ps aux | grep splunk | grep -v grep
Thanks Takajian, yes that works. I wanted to avoid doing it this way though, as it is error prone (just naming the server 'my-splunk-server' can easily make the ps result always appear as if splunk is running).
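One way to reduce that kind of false positive is to match on the exact process name instead of grepping the whole `ps` line. A sketch, assuming `pgrep` (from procps) is available and that the daemon's process name is `splunkd` as described above; `process_running` is a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if a process whose name is exactly "$1" exists.
# -x requires the whole process name to match, and without -f pgrep does not
# look at command-line arguments, so a hostname or path that merely contains
# "splunk" does not count.
process_running() {
    pgrep -x "$1" >/dev/null
}

# Possible use:
# process_running splunkd || echo "splunkd is not running"
```

This still only says that some process with that name exists, so it is a fallback rather than a replacement for a correct exit code from 'splunk status'.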
Just to clarify - 'splunk status' does work correctly and reports status for both processes, just not via the exit code. Below is a way I ended up doing it.