"ERROR: The mgmt port [8089] is already bound" prevents restarting Splunk


On a 4.1.2 Windows forwarder, we have a .path scripted input pointing to IBM WebSphere's wsadmin command-line shell. The wsadmin process launches another process (java.exe) to run a custom Jython script that we pass to wsadmin. That java process (the "grandchild" from Splunk's perspective) pipes the script's output back to wsadmin (the "child" from Splunk's perspective), which in turn passes that text back to Splunk. So far so good.

Here's the problem: when we stop Splunk, the "child" process is correctly killed by Splunk, but the "grandchild" process lives on. I assume wsadmin is not launching the java process in a way which allows Splunk to kill the whole process tree. (Unfortunately, we can't change how wsadmin launches processes, since it's IBM's code.)

Making matters worse, the orphan java process hanging around prevents restarting Splunk! We get this error when trying to restart Splunk:

ERROR: The mgmt port [8089] is already bound. Splunk needs to use this port. Would you like to change ports? [y/n]:

If we manually kill the orphan java.exe process, the error above doesn't happen. We can hack around the issue by having the grandchild process terminate itself when splunkd exits. But that requires a separate thread and/or frequent polling to detect that splunkd has exited, and it allows a race condition where Splunk fails to restart before the grandchild detects that splunkd is gone.

Any idea why I'm getting the error above, given that the process Splunk was previously talking to has been killed?

Splunk Employee

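So, splunkd.exe launches a PowerShell script, which launches a JVM, which launches a Jython process. All three "launchees" inherit splunkd.exe's file handles, which include the handle to the socket bound to the management port (8089). As long as any of them is still running, that port stays bound. That's how OSes work.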
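To see the inheritance effect in isolation, here is a minimal Python sketch. It is not Splunk or wsadmin code, and it assumes a POSIX system (it relies on `pass_fds`, which is not available on Windows, where handle inheritance works through a different mechanism but with the same consequence). The function names `port_is_free` and `inherited_socket_demo` are made up for illustration. The parent opens a listening socket, hands it to a child, and closes its own copy; the port stays bound until the child exits.

```python
import socket
import subprocess
import sys


def port_is_free(port):
    """Return True if a fresh socket can bind to 127.0.0.1:port."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind(("127.0.0.1", port))
        return True
    except OSError:
        return False
    finally:
        probe.close()


def inherited_socket_demo():
    """Show that a child holding an inherited socket keeps the port bound.

    Returns (held_while_child_alive, free_after_child_killed).
    """
    # The "parent" binds a listening socket on an OS-chosen port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    # The child does nothing with the socket; it merely inherits the descriptor.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        pass_fds=[listener.fileno()],  # POSIX only
    )

    # The parent "shuts down" and closes its own copy of the socket.
    listener.close()
    held = not port_is_free(port)

    # Once the last process holding the socket is gone, the port is released.
    child.kill()
    child.wait()
    freed = port_is_free(port)
    return held, freed


if __name__ == "__main__":
    print(inherited_socket_demo())
```

In this sketch the sleeping child plays the role of the orphan java.exe: it never uses the socket, yet its inherited copy is enough to block a new bind.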

Since you're in hackland already, consider pskill from Microsoft's Sysinternals (to kill the orphan process). It doesn't come prepackaged with Windows Server 20XX, but it's officially supported by Microsoft.

Also, instead of the child actively checking whether the parent is alive, the child could check that the modification time of a temp file is sufficiently recent (the parent periodically updates the file).
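The heartbeat-file idea could look roughly like the sketch below. It is an illustration only: the function names, the file path, and the 30-second threshold in the usage comment are assumptions, not anything Splunk provides, and it is written for current Python rather than the Jython bundled with wsadmin.

```python
import os
import time


def touch_heartbeat(path):
    """Parent side: create the file if needed and set its mtime to now."""
    with open(path, "a"):
        os.utime(path, None)


def heartbeat_is_fresh(path, max_age_seconds, now=None):
    """Child side: True if the heartbeat file was modified recently enough.

    A missing or unreadable file is treated as a stale heartbeat.
    """
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return False
    if now is None:
        now = time.time()
    return (now - mtime) <= max_age_seconds


# Hypothetical use in the child's main loop:
#     if not heartbeat_is_fresh("/tmp/parent.heartbeat", 30):
#         sys.exit(0)
```

The check is a single stat call, so the child can run it between units of work without a dedicated watcher thread. The threshold should be comfortably longer than the parent's update interval, and the race described in the question still applies: the child only notices the parent is gone after the heartbeat has gone stale.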
