Monitoring Splunk

Number of appserver.py processes increasing, causing OOM

hrawat
Splunk Employee

The search head appears to have rogue python processes (appserver.py) that slowly eat away all memory on the system and eventually cause an OOM, which requires a manual restart of splunkd. After the restart, the issue slowly starts creeping back up again.

1 Solution

hrawat
Splunk Employee

Due to an issue with the cleanup of idle processes, the number of python processes (appserver.py) running on the system constantly grows. The resulting system-wide memory growth from these stale processes eventually causes an OOM.

Run the following search to find out whether any search head is impacted by this issue and what percentage of total system memory is used by stale processes that have been running for more than 24 hours. If these processes are using more than 15% of total system memory, run the script further below to kill the stale processes.

 

index=_introspection host=<all search heads> appserver.py data.elapsed>86400
| dedup host, data.pid
| stats dc(data.pid) AS cnt sum("data.pct_memory") AS appserver_memory_used by host
| sort - appserver_memory_used
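
If you want to check a single search head directly at the OS level, the sketch below (assuming GNU ps with the etimes and pmem output columns, plus awk; the 15% threshold simply mirrors the guidance above) sums the %MEM of appserver.py processes older than 24 hours and exits non-zero when the threshold is exceeded.

#!/bin/bash
# Sketch: report total %MEM held by appserver.py processes running longer than 24h.
# Assumes GNU ps (etimes = elapsed seconds, pmem = %MEM) and awk.
THRESHOLD=15

ps -eo etimes,pid,pmem,cmd --no-headers \
  | awk -v limit=86400 -v threshold="$THRESHOLD" '
      $1 >= limit && /appserver\.py/ { sum += $3; count++ }
      END {
        printf "stale_processes=%d total_pct_memory=%.1f\n", count, sum
        exit (sum > threshold ? 1 : 0)   # non-zero exit when over the threshold
      }'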

 



On Linux/Unix, you can use the following one-liner to kill the stale processes and reclaim memory.

 

kill -TERM $(ps -eo etimes,pid,cmd | awk '{ if ($1 >= 86400) print $2 " " $4 }' | grep appserver.py | awk '{print $1}')
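
A slightly more defensive variant of the same idea (a sketch assuming GNU ps and GNU xargs; -r makes xargs do nothing when no stale processes are found) avoids calling kill with an empty argument list:

#!/bin/bash
# Sketch: send SIGTERM to appserver.py processes older than MAX_AGE seconds.
# Assumes GNU ps (etimes = elapsed seconds) and GNU xargs (-r = skip when input is empty).
MAX_AGE=86400

ps -eo etimes,pid,cmd --no-headers \
  | awk -v max="$MAX_AGE" '$1 >= max && /appserver\.py/ { print $2 }' \
  | xargs -r kill -TERM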

 




waechtler_amaso
Explorer

I see this behaviour too, and also for another process, this one coming from the ITSI app:

  /opt/splunk/etc/apps/SA-ITOA/bin/command_health_monitor.py

Besides killing the processes or restarting Splunk as a workaround, do you know whether there are efforts to finally resolve this bug?

Thanks, Jan

 


hrawat
Splunk Employee

Splunk 9.3.0 has the fix.
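
Until you can upgrade, the same workaround can be parameterized for other long-running helper scripts, such as the command_health_monitor.py process mentioned above. A minimal sketch (the script name and age threshold are arguments; the kill_stale.sh file name is just for illustration; assumes GNU ps and GNU xargs):

#!/bin/bash
# Sketch: send SIGTERM to any process whose command line contains $1 and
# that has been running longer than $2 seconds (default 24h).
# Usage example: ./kill_stale.sh command_health_monitor.py 86400
TARGET="${1:?usage: $0 <script_name> [max_age_seconds]}"
MAX_AGE="${2:-86400}"

ps -eo etimes,pid,cmd --no-headers \
  | awk -v max="$MAX_AGE" -v target="$TARGET" '$1 >= max && index($0, target) { print $2 }' \
  | xargs -r kill -TERM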
