Hello All,
Splunk Enterprise version 8.1
Following a recent server crash, our Splunk instance isn't coming up. The splunk service isn't starting even though we have gracefully rebooted the server once. The startup check reports that TCP port 8089 is already bound, and netstat shows it held by splunkd itself, even though ./splunk status says splunkd is not running. Please see the output below. Even if I force-kill the PID listening on 8089/TCP, the system automatically spawns a new splunkd process and 8089 shows as occupied again. This repeats in an endless loop. There is nothing in the splunkd.log file that explains this behavior.
What is making Splunk launch a new process automatically after we force-kill the PID?
I have tried https://community.splunk.com/t5/Deployment-Architecture/How-to-resolve-error-quot-ERROR-The-mgmt-por... but no luck. As mentioned, we even restarted the host.
[svc-splunk@hostname bin]$ ./splunk status
splunkd is not running.
[svc-splunk@hostname bin]$ ./splunk start
Splunk> The Notorious B.I.G. D.A.T.A.
Checking prerequisites...
Checking http port [8000]: open
Checking mgmt port [8089]: not available
ERROR: mgmt port [8089] - port is already bound. Splunk needs to use this port.
[root@hostname bin]# netstat -tulpn | grep 8089
tcp 0 0 0.0.0.0:8089 0.0.0.0:* LISTEN 7523/splunkd
[root@hostname bin]# kill -9 7523
[root@hostname bin]# netstat -tulpn | grep 8089
tcp 0 0 0.0.0.0:8089 0.0.0.0:* LISTEN 7979/splunkd
[root@hostname bin]# kill -9 7979
[root@hostname bin]# netstat -tulpn | grep 8089
tcp 0 0 0.0.0.0:8089 0.0.0.0:* LISTEN 8452/splunkd
[svc-splunk@hostname bin]$ ./splunk status
splunkd is not running.
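Since each killed PID is replaced by a new one, something above splunkd in the process tree is presumably restarting it. A minimal sketch for tracing that parent, assuming a Linux host with the procps ps command; parent_of and ancestry are hypothetical helper names, not Splunk tooling:

```shell
# Hypothetical helper: print the parent PID of the given PID.
parent_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# Hypothetical helper: print one line per ancestor (pid, ppid, user, command),
# starting at the given PID and walking up until PID 1 or a vanished process.
ancestry() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        ps -o pid=,ppid=,user=,args= -p "$pid" || break
        pid=$(parent_of "$pid")
    done
}

# Example (run as root against the PID currently listening on 8089):
#   ancestry 8452
```

If the chain leads straight to PID 1, a service manager unit or init script with a restart policy is one plausible culprit; if it leads to another splunkd or a wrapper script, that process would need to be stopped first.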
Any suggestions? If a reinstall is the only option, please suggest how to back up this Deployment Server and restore it. It is a Deployment Server with 500+ clients phoning home to it.
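For reference, the kind of backup being asked about might look like the sketch below. It assumes a default layout in which the etc directory under the Splunk install root holds the deployment-apps and serverclass.conf, and that the replacement install is the same Splunk version; backup_splunk_etc and restore_splunk_etc are hypothetical helper names, and the paths in the example are placeholders:

```shell
# Hypothetical helper: archive the etc directory of a Splunk install.
# $1 = Splunk install root (assumed, e.g. /opt/splunk), $2 = archive path.
backup_splunk_etc() {
    splunk_home=$1
    archive=$2
    [ -d "$splunk_home/etc" ] || { echo "no etc directory under $splunk_home" >&2; return 1; }
    # -C keeps the paths inside the archive relative to the install root
    tar -czf "$archive" -C "$splunk_home" etc
}

# Hypothetical helper: unpack that archive into a fresh install root.
# $1 = archive path, $2 = new Splunk install root.
restore_splunk_etc() {
    tar -xzf "$1" -C "$2"
}

# Example (placeholder paths, run while splunkd is stopped):
#   backup_splunk_etc /opt/splunk /backup/splunk-etc.tgz
#   restore_splunk_etc /backup/splunk-etc.tgz /opt/splunk
```

Keeping the archive outside the install root avoids archiving the backup into itself, and file ownership should be checked after a restore if it is run as a different user than svc-splunk.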
Thanks