Splunk Search Head hangs and must be restarted

ttrolf
Explorer

We are now seeing this quite often: the Splunk search head simply stops responding. We try running splunk stop or splunk restart, and neither works. When we grep the process list, we see that this process never exits:

python -O /opt/splunk/lib/python2.6/site-packages/splunk/appserver/mrsparkle/root

We must kill this process with kill -9 and then run splunk start, after which all is OK again. We are running Splunk 4.2.1 x86_64 on 64-bit RHEL 5.
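For readers who want to script the check before the manual kill -9, here is a minimal sketch in Python 3 (an assumption; the Python bundled with Splunk 4.2 is 2.6). It only parses `ps -eo pid,args` output and looks for the process line quoted above. The names `find_splunkweb_pids`, `current_ps_output`, and `force_kill` are hypothetical helpers, not part of Splunk, and the match string is limited to the portion of the path shown in the post.

```python
import os
import signal
import subprocess

# Substring of the command line quoted in the post. The path in the post may be
# truncated, so only the visible part is matched (an assumption of this sketch).
MARKER = "splunk/appserver/mrsparkle/root"


def find_splunkweb_pids(ps_output, marker=MARKER):
    """Return PIDs from `ps -eo pid,args` output whose command line contains marker."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip blank lines and the header row, whose first field is not numeric.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = parts
        if marker in args:
            pids.append(int(pid))
    return pids


def current_ps_output():
    """Capture the live process table as text (Linux procps `ps` assumed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return result.stdout


def force_kill(pids):
    """Send SIGKILL, the equivalent of `kill -9`, to each PID."""
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
```

A cautious way to use it is to call `find_splunkweb_pids(current_ps_output())`, inspect the returned list by hand, and call `force_kill` only after splunk stop has already failed, then run splunk start as described above.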

Any idea what might be causing this? I have not seen anything interesting in splunkd.log, and no ERROR entries relating to this.

tmeader
Contributor

I'm not sure if this is related or not, but we've seen several instances (about 3 times in the past 6 months, so all on version 4.2.x) where our search head box has essentially hung completely. If we were fortunate enough to be logged into the box via SSH at the time, recovery is possible by stopping Apache and Splunk completely and restarting them. If not, we can't SSH into the box or even bring up the console via KVM, and a manual physical reset of the box is the only recourse.

To the original poster: you aren't by chance running single sign-on, are you? We have Splunk running behind a local Apache proxy using RSA SecurID for auth, hence the need to restart Splunk AND Apache. We can never identify anything out of the ordinary after the reboot either, other than a segfault in the SecurID module ~20 minutes before the box became completely unresponsive.

ttrolf
Explorer

We did not have any hardware issues or CPU / memory problems on the server. Only the process I mentioned hung, and that led to the problems I described. Splunk is still running but cannot be contacted. It also seems to occur on our newer search head running 4.3.
