splunkd dies every day with the same error

vincenty
Explorer

splunkd dies every day with the same error:
FATAL ProcessRunner - Unexpected EOF from process runner child!
ERROR ProcessRunner - helper process seems to have died (child killed by signal 9: Killed)!

I can't see anything that might have caused this... splunkd doesn't even last 24 hours after a restart...

Here's a partial log:
04-13-2013 13:37:03.498 +0000 WARN FilesystemChangeWatcher - error getting attributes of path "/home/c9logs/c9logs/edgdc2/sdi_slce28vmf6011/.zfs/snapshot/.auto-1365364800/config/m_domains/tasdc2_domain/servers/AdminServer/adr": Permission denied
04-13-2013 13:37:03.499 +0000 WARN FilesystemChangeWatcher - error getting attributes of path "/home/c9logs/c9logs/edgdc2/sdi_slce28vmf6011/.zfs/snapshot/.auto-1365364800/config/m_domains/tasdc2_domain/servers/AdminServer/sysman": Permission denied
04-13-2013 13:38:37.102 +0000 FATAL ProcessRunner - Unexpected EOF from process runner child!
04-13-2013 13:38:37.325 +0000 ERROR ProcessRunner - helper process seems to have died (child killed by signal 9: Killed)!

codebuilder
SplunkTrust

Your ulimits are not set correctly, or are still at the system defaults.
As a result, splunkd is likely using more memory than is allowed or available, so the kernel kills the process to protect itself.
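
For reference, here's a minimal sketch of how you might verify and raise those limits, assuming splunkd runs as a local user named splunk under a default /opt/splunk install (the user name, path, and values are illustrative; use the limits recommended for your Splunk version):

# Check the limits the splunk user actually gets
su - splunk -c 'ulimit -a'

# splunkd records the limits it started with in splunkd.log
grep -i ulimit /opt/splunk/var/log/splunk/splunkd.log

# Raise them persistently in /etc/security/limits.conf, for example:
splunk soft nofile 65536
splunk hard nofile 65536
splunk soft nproc  16384
splunk hard nproc  16384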

RishiMandal
Explorer

Did you get this resolved?
Can you check and confirm whether splunkd was getting killed right after an active session was terminated, i.e., does it die as soon as someone logs out of your Splunk session or the server?

realsplunk
Motivator

We had this problem with an infinite loop inside a macro (the macro was calling itself), even though we had memory limits configured under the [search] stanza in limits.conf.
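
If you want Splunk itself to stop a runaway search process before the kernel's OOM killer steps in, a [search] stanza along these lines in limits.conf is one sketch (these settings exist only in newer releases and the values are illustrative; check limits.conf.spec for your version):

[search]
# Track per-search memory usage (required for the thresholds below to take effect)
enable_memory_tracker = true
# Terminate a search process that exceeds this much physical memory, in MB
search_process_memory_usage_threshold = 4000
# ...or this percentage of total physical memory
search_process_memory_usage_percentage_threshold = 25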

RishiMandal
Explorer

How did you find the macro that was causing issues and calling itself? That would be helpful for me so I can validate the same.

realsplunk
Motivator

We correlated it with changes made that day.

mweissha
Path Finder

My $0.02 is that this is memory-related. I am having the same issue, and a check of /var/log/messages shows:

Apr 20 01:59:06 splog1 kernel: Out of memory: Kill process 45929 (splunkd) score 17 or sacrifice child
Apr 20 01:59:06 splog1 kernel: Killed process 45934, UID 5000, (splunkd) total-vm:66104kB, anon-rss:1260kB, file-rss:4kB

This was happening on a new instance of Enterprise 6.5.3. I traced it to an input source that was particularly large and hadn't been indexed for a while due to the upgrade. I had to restart splunkd a few times on the indexer, and now it's running well.

rsolutions
Path Finder

Was this ever resolved?

rvenkatesh25
Engager

Check syslog/dmesg to see if the kernel's oom_killer is being invoked; you'll see entries like the ones below:

Out of memory: Kill process 7575 (splunkd) score 201 or sacrifice child
Killed process 7576, UID 1000, (splunkd) total-vm:70232kB, anon-rss:392kB, file-rss:152kB
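
A quick way to look for those entries (a sketch; the syslog path varies by distro, e.g. /var/log/syslog on Debian/Ubuntu instead of /var/log/messages):

# Kernel ring buffer
dmesg | grep -iE 'out of memory|oom|killed process'

# Persistent syslog (RHEL/CentOS path shown)
grep -iE 'out of memory|oom-killer' /var/log/messages

# On systemd-based systems
journalctl -k | grep -iE 'out of memory|oom'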

gkanapathy
Splunk Employee

Signal 9 is a KILL signal from an external process. It is likely that your OS has some kind of monitor or other setting on it that kills processes that do certain things. Perhaps your administrator is watching for memory usage, access to certain files, or other things. You should consult with your system admin to find out what they have put in place.
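
If dmesg/syslog show no OOM activity and you suspect something else is sending the SIGKILL, one way to identify the sender on Linux is an audit rule on the kill syscall (a sketch; it requires auditd, and the kernel's OOM killer will not show up here because it does not go through kill()):

# Log every kill() call that sends signal 9 (64-bit syscalls shown)
auditctl -a always,exit -F arch=b64 -S kill -F a1=9 -k splunkd_kill9

# Later, review who sent the signal and to which PID
ausearch -k splunkd_kill9 --interpret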
