Monitoring Splunk

splunkd PIDs going "defunct" after restart

splunkuseradmin
Path Finder

Hi all,

I've run into an issue that may be a known one; please help.
Every so often the splunkd service on one indexer or another goes into a "defunct" state. When I then try to restart the service, all Splunk-related PIDs go into that state too, so I cannot restart the service either.

Initially it showed the "defunct" process below, so I tried to restart splunk.service:
[root@hostname ~]# ps -ef | grep defunct
root 23399 23167 0 21:49 pts/1 00:00:00 grep --color=auto defunct
svc.spl+ 32079 4383 0 09:07 ? 00:00:03 [splunkd]

[root@hostname ~]#

I tried to restart and the status is below.

[root@hostname ~]# systemctl status splunk.service
● splunk.service - Splunk
Loaded: loaded (/usr/lib/systemd/system/splunk.service; enabled; vendor preset: disabled)
Active: activating (start) since Sun 2020-02-02 22:45:54 MST; 1min 9s ago
Process: 29434 ExecStop=/opt/splunk/bin/splunk stop (code=killed, signal=TERM)
Main PID: 4381; : 32121 (splunk)
CGroup: /system.slice/splunk.service
└─32121 /opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
‣ 4381 [splunkd]

Feb 02 22:45:54 hostnme.domain.com systemd[1]: Starting Splunk...
Feb 02 22:45:54 hostnme.domain.com splunk[32121]: splunkd 4381 was not running.
Feb 02 22:45:54 hostnme.domain.com splunk[32121]: Stopping splunk helpers...

[root@hostname ~]# ps -ef | grep splunk
svc.spl+ 4381 1 99 Jan30 ? 5-12:21:04 [splunkd]
svc.spl+ 7638 1 0 09:05 ? 00:00:37 [splunkd]
svc.spl+ 7639 7638 0 09:05 ? 00:00:00 [splunkd]
svc.spl+ 26167 1 0 09:06 ? 00:00:14 [splunkd]
svc.spl+ 26168 26167 0 09:06 ? 00:00:00 [splunkd]
svc.spl+ 32079 1 0 09:07 ? 00:00:03 [splunkd]
svc.spl+ 32080 32079 0 09:07 ? 00:00:00 [splunkd]
svc.spl+ 32121 1 0 22:45 ? 00:00:00 /opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
root 32281 23128 0 22:47 pts/1 00:00:00 grep --color=auto splunk
svc.spl+ 36590 1 0 09:08 ? 00:00:03 [splunkd]
svc.spl+ 36597 36590 0 09:08 ? 00:00:00 [splunkd]
svc.spl+ 36744 1 0 09:08 ? 00:00:08 [splunkd]
svc.spl+ 36746 36744 0 09:08 ? 00:00:00 [splunkd]
svc.spl+ 45222 1 0 09:10 ? 00:00:01 [splunkd]
svc.spl+ 45224 45222 0 09:10 ? 00:00:00 [splunkd]

Any help would be appreciated.

Thank you.

0 Karma

codebuilder
Influencer

This can happen when you initially install/start Splunk as root but later change the owner to another user, e.g. "splunk". To recover:
Stop Splunk gracefully: systemctl stop splunk.
Check for any remaining processes (ps -ef | grep -i splunk) and kill any that remain (e.g. kill -9 <PID>).
Ensure that Splunk is set to run as the user you intend: check /opt/splunk/etc/splunk-launch.conf.
Restart Splunk and verify. Note: if using systemd, also update the user in the unit file (/etc/systemd/system/splunkd.service, or whatever you named it).
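
The steps above could be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, the unit name (splunk.service) and install path (/opt/splunk) are taken from the original post, and I'm assuming the run-as user is set via SPLUNK_OS_USER in splunk-launch.conf. The commands that actually stop or kill anything are left commented out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; unit name and path come from the post above.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
SPLUNK_UNIT="${SPLUNK_UNIT:-splunk.service}"

# Read "PID STAT COMMAND" lines on stdin and print the PIDs whose
# command name starts with "splunk".
leftover_splunk_pids() {
    awk '$3 ~ /^splunk/ { print $1 }'
}

# Print the value of SPLUNK_OS_USER from the given splunk-launch.conf,
# if it is set (the last uncommented assignment wins).
configured_splunk_user() {
    sed -n 's/^[[:space:]]*SPLUNK_OS_USER[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Usage (as root), commented out so nothing is stopped or killed by accident:
#   systemctl stop "$SPLUNK_UNIT"
#   ps -eo pid=,stat=,comm= | leftover_splunk_pids | xargs -r kill -9
#   configured_splunk_user "$SPLUNK_HOME/etc/splunk-launch.conf"
#   systemctl show -p User "$SPLUNK_UNIT"
#   systemctl start "$SPLUNK_UNIT"
```

Keep in mind that a process that is already defunct (a zombie) does not react to kill -9; it only disappears once its parent reaps it or the parent itself exits.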

0 Karma

harsmarvania57
Ultra Champion

Have you looked at dmesg or /var/log/messages? It looks like the Splunk process was killed, but to find out why it was killed you need to check the OS logs (for example, the OOM killer).
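
A minimal sketch of that check, assuming the usual kernel wording for OOM events ("Out of memory", "oom-killer", "Killed process"); the function name is hypothetical and the patterns may need adjusting for your kernel version:

```shell
#!/usr/bin/env bash
# Read kernel log lines on stdin and print those that look like
# OOM-killer activity (case-insensitive match on common phrases).
oom_kill_lines() {
    grep -Ei 'out of memory|oom-killer|killed process'
}

# Usage (as root):
#   dmesg -T | oom_kill_lines
#   oom_kill_lines < /var/log/messages
```

If nothing matches, the OOM killer is probably not the cause and the rest of the kernel log around the time of the failure is worth reading.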

0 Karma

splunkuseradmin
Path Finder

I see some kernel hung_task messages like "Splunkd : is blocked for more than 120 secs".
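
For anyone following along, this is roughly how I'm looking for them. It's a rough sketch with a hypothetical helper name; it just lists processes that are in uninterruptible sleep (STAT starting with D) or defunct (STAT starting with Z), which is what I'd expect for tasks the kernel reports as blocked:

```shell
#!/usr/bin/env bash
# Read "PID STAT COMMAND" lines on stdin and print the ones whose state
# is D (uninterruptible sleep) or Z (zombie/defunct).
blocked_or_defunct() {
    awk '$2 ~ /^[DZ]/ { print }'
}

# Usage:
#   ps -eo pid=,stat=,comm= | blocked_or_defunct
#   dmesg -T | grep -i 'blocked for more than'
```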

0 Karma