Monitoring Splunk

splunkd PID going into "defunct" state after restart.

splunkuseradmin
Path Finder

Hi all,

I need help with an issue that might be a known one.
Every so often, the splunkd service on one or another of our indexers goes into a "defunct" state. When I then try to restart the service, all the Splunk-related PIDs go into the same state, so I cannot restart the service either.

Initially, ps showed the "defunct" process below, so I tried to restart splunk.service.
[root@hostname ~]# ps -ef | grep defunct
root 23399 23167 0 21:49 pts/1 00:00:00 grep --color=auto defunct
svc.spl+ 32079 4383 0 09:07 ? 00:00:03 [splunkd] <defunct>

[root@hostname ~]#

I tried to restart and the status is below.

[root@hostname ~]# systemctl status splunk.service
● splunk.service - Splunk
Loaded: loaded (/usr/lib/systemd/system/splunk.service; enabled; vendor preset: disabled)
Active: activating (start) since Sun 2020-02-02 22:45:54 MST; 1min 9s ago
Process: 29434 ExecStop=/opt/splunk/bin/splunk stop (code=killed, signal=TERM)
Main PID: 4381; : 32121 (splunk)
CGroup: /system.slice/splunk.service
└─32121 /opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
‣ 4381 [splunkd]

Feb 02 22:45:54 hostnme.domain.com systemd[1]: Starting Splunk...
Feb 02 22:45:54 hostnme.domain.com splunk[32121]: splunkd 4381 was not running.
Feb 02 22:45:54 hostnme.domain.com splunk[32121]: Stopping splunk helpers...

[root@hostname ~]# ps -ef | grep splunk
svc.spl+ 4381 1 99 Jan30 ? 5-12:21:04 [splunkd]
svc.spl+ 7638 1 0 09:05 ? 00:00:37 [splunkd]
svc.spl+ 7639 7638 0 09:05 ? 00:00:00 [splunkd]
svc.spl+ 26167 1 0 09:06 ? 00:00:14 [splunkd]
svc.spl+ 26168 26167 0 09:06 ? 00:00:00 [splunkd]
svc.spl+ 32079 1 0 09:07 ? 00:00:03 [splunkd]
svc.spl+ 32080 32079 0 09:07 ? 00:00:00 [splunkd]
svc.spl+ 32121 1 0 22:45 ? 00:00:00 /opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
root 32281 23128 0 22:47 pts/1 00:00:00 grep --color=auto splunk
svc.spl+ 36590 1 0 09:08 ? 00:00:03 [splunkd]
svc.spl+ 36597 36590 0 09:08 ? 00:00:00 [splunkd]
svc.spl+ 36744 1 0 09:08 ? 00:00:08 [splunkd]
svc.spl+ 36746 36744 0 09:08 ? 00:00:00 [splunkd]
svc.spl+ 45222 1 0 09:10 ? 00:00:01 [splunkd]
svc.spl+ 45224 45222 0 09:10 ? 00:00:00 [splunkd]

Any help would be appreciated.

Thank you.


codebuilder
SplunkTrust

This happens when you initially install/start Splunk as root, but then change the owner to another user, e.g. "splunk". To fix it:
Stop Splunk gracefully: systemctl stop splunk.
Check for any remaining processes (ps -ef | grep -i splunk), and kill any that remain (e.g. kill -9 <PID>).
Ensure that Splunk is set to run as the user you intend; check /opt/splunk/etc/splunk-launch.conf.
Restart Splunk and verify.
Note: if using systemd, also update the user in the unit file (/etc/systemd/system/splunkd.service, or whatever you named it).
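
The checks in the steps above can be sketched as a few read-only helpers. They read the run-as user from splunk-launch.conf and from a systemd unit file, and list files that are not owned by that user. This is only a sketch: the helper names are hypothetical, and it assumes the run-as user is set through the SPLUNK_OS_USER setting in splunk-launch.conf and through User= in the unit file.

```shell
# Sketch only: read-only checks, nothing here stops or changes Splunk.
# Helper names are hypothetical; all paths are passed in by the caller.

# Print the SPLUNK_OS_USER value from a splunk-launch.conf file (empty if unset).
configured_user() {
    sed -n 's/^[[:space:]]*SPLUNK_OS_USER[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Print the User= value from a systemd unit file (empty if unset).
unit_user() {
    sed -n 's/^[[:space:]]*User[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# List files under directory $2 that are NOT owned by user $1.
files_not_owned_by() {
    find "$2" ! -user "$1"
}
```

For example, configured_user /opt/splunk/etc/splunk-launch.conf and unit_user /etc/systemd/system/splunkd.service should print the same user, and files_not_owned_by splunk /opt/splunk should print nothing once ownership is consistent.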


harsmarvania57
SplunkTrust

Have you looked at dmesg or /var/log/messages? It looks like the splunk process was killed; to find out why, check the OS logs (for example, for the OOM killer).
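
One way to run that check is to filter the kernel log for OOM killer messages. This is a minimal sketch: the helper name oom_events is hypothetical, and the patterns follow the usual wording of Linux OOM killer messages, which may differ between kernel versions.

```shell
# Sketch only: filter log lines on stdin for typical OOM killer messages.
# The patterns are assumptions based on common kernel wording.
oom_events() {
    grep -Ei 'invoked oom-killer|out of memory|killed process'
}
```

It can be fed from either source mentioned above, e.g. dmesg -T | oom_events, or oom_events < /var/log/messages.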


splunkuseradmin
Path Finder

I see some kernel hung_task messages saying that splunkd "is blocked for more than 120 secs".
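
Messages like this are logged when the kernel finds a task stuck in uninterruptible sleep (state D), often while waiting on I/O. The sketch below pulls those messages out of a log and lists processes currently in a given state (D, or Z for defunct). The helper names are hypothetical, and the message pattern and the ps column layout are assumptions.

```shell
# Sketch only: helper names are hypothetical.

# Filter log lines on stdin for hung-task warnings from the kernel.
hung_tasks() {
    grep -E 'blocked for more than [0-9]+ seconds'
}

# Read "PID PPID STAT COMMAND" rows (with a header line) on stdin and print
# the rows whose state starts with the letter given in $1.
procs_in_state() {
    awk -v s="$1" 'NR > 1 && substr($3, 1, 1) == s { print $1, $2, $3, $4 }'
}
```

For example: dmesg -T | hung_tasks, then ps -eo pid,ppid,stat,comm | procs_in_state D to see what is blocked right now, or the same with Z to list defunct processes together with their parent PIDs.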
