Modular Input Scripts Don't Die during Splunk restart

alacercogitatus
SplunkTrust

I have written two Modular Inputs for Splunk. Both exhibit the same behavior.

Steps to reproduce:

  1. Run "splunk restart"
  2. Run "ps -ef | grep python"

The Python script for each Modular Input data input is orphaned by the restart, and when Splunk starts back up, it spawns a new Python process for each data input. The orphaned processes pile up and very quickly make the box unresponsive, especially during dev work. I have not noticed this behavior on Windows or Mac OS X.

Linux: Ubuntu Linux #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Example:

ps -ef yields:
root 28238 28237 0 19:36 ? 00:00:01 python /opt/splunkbeta/etc/apps/GoogleApps/bin/googleapps.py

splunk stop

ps -ef yields:
root 28238 1 0 19:36 ? 00:00:01 python /opt/splunkbeta/etc/apps/GoogleApps/bin/googleapps.py

Am I missing something or is this a bug?

Damien_Dallimor
Ultra Champion

I also experienced this and have raised the issue.
In the meantime, you can look at how I implemented a workaround for my JMS Messaging Modular Input:
https://github.com/damiendallimore/SplunkModularInputsJavaFramework/tree/master/jms

Basically, the mod input script (jms.py) writes a PID file that gets checked upon startup.
Also, the Java program that the jms.py script executes has some simple logic to check whether Splunk is still up and, if not, kills itself.

This ensures that for "splunk start|restart" there will only be 1 mod input process running, and for "splunk stop" there will be zero mod input processes running.
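
The PID-file half of this workaround could look roughly like the following in Python. This is a minimal sketch, not the code from jms.py: the helper names `live_pid_from_file` and `write_pid_file` are hypothetical, and it assumes a POSIX system where `os.kill(pid, 0)` probes for a process without actually signalling it.

```python
import os


def live_pid_from_file(path):
    """Return the PID recorded in path if that process is still alive, else None."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return None  # no PID file yet, or unreadable contents
    if pid <= 0:
        return None  # not a valid single-process ID
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only, nothing is delivered
    except ProcessLookupError:
        return None  # stale file: that process has exited
    except PermissionError:
        pass  # process exists but belongs to another user
    return pid


def write_pid_file(path):
    """Record the current process ID so the next startup can find this instance."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))
```

At startup the script would call `live_pid_from_file` and, if it returns a PID, either exit or terminate that earlier instance before calling `write_pid_file`. PIDs can be reused, so a stricter check would also confirm that the process really is the input script.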

kschon_splunk
Splunk Employee

Another suggestion: The standard output and standard error of a scripted or modular input are piped back to Splunk, so when Splunk shuts down, those pipes break. So in Java, you can use:

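// checkError() flushes System.out and returns true once a write has failed,
// which is what happens after Splunk closes its end of the pipe.
boolean isSplunkRunning() {
    return !System.out.checkError();
}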

This will return false the first time you write an event after Splunk shuts down. This is simpler than recording the PID and works on any OS.
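
For inputs written in Python, a rough analogue is to treat a failed write to stdout as the shutdown signal. This is a minimal sketch that assumes events are sent over stdout; `write_event` is a hypothetical helper name, and it relies on Python raising `BrokenPipeError` when the reading end of the pipe has been closed.

```python
import sys


def write_event(text, stream=None):
    """Write one event and flush; return False if the pipe to Splunk is gone."""
    stream = sys.stdout if stream is None else stream
    try:
        stream.write(text + "\n")
        stream.flush()  # force the write now so a closed pipe is noticed immediately
    except BrokenPipeError:
        return False
    return True
```

A long-running input would leave its main loop and exit the first time `write_event` returns False.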

Damien_Dallimor
Ultra Champion

That only works if you use STDOUT. Many of my offerings allow you to bypass STDOUT (it performs poorly) and use HEC (HTTP Event Collector) as the output "pipe" to Splunk.

kschon_splunk
Splunk Employee

Fair enough. A lot of folks will periodically write whitespace chars to stdout to test the pipe.

BTW, the github link above gives me a 404. Does it still exist someplace?

Damien_Dallimor
Ultra Champion

Hyperlink fixed, although it was showing the correct text. Hard to keep on top of answers from 2012 🙂

kschon_splunk
Splunk Employee

Thanks, I can see it now.

igor
Splunk Employee

Two questions come to mind: 1) Does "splunk restart" result in a clean shutdown? (You should see a line in splunkd.log that says "loader - All pipelines finished."). 2) Is it the case that the python script that splunkd spawns calls another python script? When splunkd shuts down, it sends SIGTERM to the process that it spawned; but if that process spawned another, that one won't be killed/signaled. That requires special handling.
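
For the second case, one option is for the script that splunkd spawns to relay SIGTERM to whatever it launches. The following is a minimal POSIX-oriented sketch; `run_with_forwarded_sigterm` is a hypothetical helper name, not part of any Splunk library.

```python
import signal
import subprocess


def run_with_forwarded_sigterm(cmd):
    """Run cmd as a child process, relay SIGTERM to it, and return its exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        # Only this process receives the signal, so pass it on to the child.
        if child.poll() is None:
            child.terminate()

    previous = signal.signal(signal.SIGTERM, forward)
    try:
        return child.wait()
    finally:
        signal.signal(signal.SIGTERM, previous)  # restore the earlier handler
```

The wrapper returns once the child has exited, so the spawned script and its child go away together. A child that ignores SIGTERM would need a timeout followed by `child.kill()`.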

alacercogitatus
SplunkTrust

Yep, I do the same thing in my MIs. On non-Windows OSes I check the grandparent PID and exit if it is 1. I wanted to see who else was seeing this, thanks!
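
A check along those lines might be sketched as follows. The function names are hypothetical, and the lookup assumes either a Linux-style /proc filesystem or a `ps` command that accepts `-o ppid=`. The sketch also treats a parent PID of 1 as orphaned, matching the ps output in the original question.

```python
import os
import subprocess


def parent_pid_of(pid):
    """Return the parent PID of pid, or None if it cannot be determined."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("PPid:"):
                    return int(line.split()[1])
    except OSError:
        pass  # no /proc here, or the process is gone; try ps instead
    try:
        out = subprocess.run(["ps", "-o", "ppid=", "-p", str(pid)],
                             capture_output=True, text=True).stdout.strip()
    except OSError:
        return None  # ps is not available
    return int(out) if out else None


def is_orphaned():
    """Return True if this process or its parent has been re-parented to PID 1."""
    parent = os.getppid()
    if parent == 1:
        return True
    return parent_pid_of(parent) == 1
```

The input would call `is_orphaned()` periodically and exit when it returns True. On systems where orphans are adopted by a process other than PID 1, this check would need adjusting.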

Damien_Dallimor
Ultra Champion

With modular inputs you can specify in the scheme whether to run in single instance or multi instance mode. My JMS input mentioned above runs in single instance mode.
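
The mode is selected with the `use_single_instance` element of the scheme that the script returns when Splunk invokes it with `--scheme`. A minimal sketch follows; the title and description are placeholder values, `handle_args` is a hypothetical helper, and a real scheme would normally also declare its arguments.

```python
# Placeholder scheme: the title and description are illustrative values only.
SCHEME = """<scheme>
    <title>Example Input</title>
    <description>Illustrative modular input scheme.</description>
    <streaming_mode>xml</streaming_mode>
    <use_single_instance>true</use_single_instance>
</scheme>"""


def handle_args(argv):
    """Return the scheme XML when invoked with --scheme, otherwise None."""
    if len(argv) > 1 and argv[1] == "--scheme":
        return SCHEME
    return None
```

The script's entry point would print the result of `handle_args(sys.argv)` when it is not None and then exit.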

csharp_splunk
Splunk Employee

I've also considered implementing this in my scripted inputs. It's a good idea for anyone writing them, since the scripted input (and by extension, the modular input) is designed as much to repeatedly execute the same script as to execute a script that stays running. If the script is meant to be a singleton, it makes sense for the script itself to ensure that only one instance of it is executing.
