I've found a bit of a bug with the ps_sos.sh script that's a part of this app.
After a couple of restarts of our Splunk server, I noticed that the memory footprint shown in the SOS app was changing dramatically.
There are two splunkd processes running on the server (the main splunkd and the splunkd helper), and the script returns the resource usage of only the first of the two. After a restart, the process it returns appears to change, so the reported resource usage changes dramatically.
I'm running on a 64-bit Linux server. Is this a known issue?
Otherwise, great app!
12/06 UPDATE: We have made some changes in version 2.3.1 of the S.o.S app that should prevent this problem from happening. The aggregated CPU and memory usage statistics for type "searches" will now account for both the main and helper processes of every search.
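As a rough illustration of what "accounting for both processes" means, here is a minimal sketch that totals resident memory across several PIDs. This is not the actual S.o.S code; the function name `sum_rss_kb` is made up for this example, and it assumes a `ps` that supports `-o rss=` and a comma-separated `-p` list (as on Linux).

```shell
#!/bin/sh
# Hypothetical sketch (not the S.o.S implementation): total resident memory,
# in KB, across a set of PIDs such as a main process and its helper.
sum_rss_kb() {
    # Join the PID arguments with commas, the list form that ps -p accepts
    pid_list=$(echo "$@" | tr ' ' ',')
    # One RSS value per matching process, with no header; add them up
    ps -o rss= -p "$pid_list" | awk '{ total += $1 } END { print total + 0 }'
}

# Example (placeholder PIDs): sum_rss_kb 1234 1235
```

Summing over every PID that belongs to the same logical process gives a figure that does not depend on which PID happens to be listed first.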
Thank you for the feedback, Ashley. I have not seen that happen in our tests. We use the first PID in the $SPLUNK_HOME/var/run/splunk/splunkd.pid file to pick the ps row corresponding to the main splunkd process. Would you be able to tell us whether, on your system, that first PID is sometimes that of the helper process rather than the main splunkd?
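If it helps, something along these lines could be used to check. It is only a sketch: `first_pid_row` is a made-up helper name, not something that ships with ps_sos.sh, and it assumes SPLUNK_HOME is set and that `ps` accepts `-o` and `-p`. It reads the first line of the PID file and prints the matching ps row.

```shell
#!/bin/sh
# Hypothetical helper (not part of ps_sos.sh): print the ps row for the
# first PID listed in a PID file.
first_pid_row() {
    pid_file="$1"
    # Take the first line of the PID file and strip any whitespace
    first_pid=$(head -n 1 "$pid_file" | tr -d '[:space:]')
    # Print PID, parent PID, resident memory (KB), and full command line,
    # with headers suppressed by the trailing "=" on each column
    ps -o pid= -o ppid= -o rss= -o args= -p "$first_pid"
}

# Example: first_pid_row "$SPLUNK_HOME/var/run/splunk/splunkd.pid"
```

Running it after each restart and comparing the command line and parent PID columns should show whether the first PID in the file ever belongs to the helper. If the helper is a child of the main splunkd, its parent PID column would be the main process's PID.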