Splunk Search

fix cpu by process search

dinisco
Explorer

The *nix app has a CPU-by-process search that gives misleading results under certain conditions:

index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND | timechart avg(pctCPU) by COMMAND

The problem is that when a single event contains multiple processes with the same command name, this search averages their CPU instead of adding it up. So 5 foo processes, each consuming 3% CPU, show up as foo=3% when the actual total is 15%.
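
To make the arithmetic concrete, here is a small Python sketch (not SPL) of what happens inside one ps event. The rows are made up for illustration and simply mirror the 5 x foo at 3% case described above.

```python
from statistics import mean

# Hypothetical rows from a single ps event: (COMMAND, pctCPU)
rows = [("foo", 3.0)] * 5

# Collect the CPU values of every process named foo in this event
values = [pct for cmd, pct in rows if cmd == "foo"]

avg_per_process = mean(values)   # what averaging by COMMAND reports
total_for_command = sum(values)  # what foo is actually consuming

print(avg_per_process, total_for_command)  # 3.0 15.0
```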

I worked around this by combining COMMAND with PID so that each series is unique:

index="os" sourcetype="ps" host=$host$ | multikv fields pctCPU, COMMAND, PID | strcat COMMAND "_" PID cmd | where pctCPU>0 | timechart avg(pctCPU) by cmd limit=0

But this gets messy on systems with 50+ processes sharing the same COMMAND name, and Firefox doesn't seem to like limit=0.

Ideally I could sum pctCPU within each event across all processes with the same COMMAND. That would give a single line on the chart for foo showing 15%, instead of 5 lines showing foo_$pid at 3% each. Is this possible?


gkanapathy
Splunk Employee

You're right. This might do it:

index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND | stats sum(pctCPU) as pctCPU by _time,COMMAND | timechart avg(pctCPU) by COMMAND

That is, sum the CPU for each command at each measurement (the rows that share the same _time) before you bucket and average.
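
For anyone who wants to see the two stages spelled out, here is a rough Python sketch of the same idea (not SPL, and not how Splunk implements it internally). The function name `sum_then_average`, the sample rows, and the 60-second span are all assumptions for illustration, and the bucketing is a simplification of what timechart does.

```python
from collections import defaultdict

def sum_then_average(rows, span):
    """rows: iterable of (time, command, pct_cpu); span: bucket width in seconds."""
    # Stage 1: total CPU per command at each measurement,
    # analogous to `stats sum(pctCPU) by _time, COMMAND`
    per_sample = defaultdict(float)
    for t, cmd, pct in rows:
        per_sample[(t, cmd)] += pct

    # Stage 2: average those per-measurement totals within each time bucket,
    # analogous to `timechart avg(pctCPU) by COMMAND`
    buckets = defaultdict(list)
    for (t, cmd), total in per_sample.items():
        buckets[(t - t % span, cmd)].append(total)

    return {key: sum(v) / len(v) for key, v in buckets.items()}

# Hypothetical data: two ps samples 30 seconds apart
rows = (
    [(0, "foo", 3.0)] * 5       # 5 foo processes at 3% each -> 15% total
    + [(30, "foo", 5.0)] * 4    # 4 foo processes at 5% each -> 20% total
    + [(0, "bar", 1.0), (30, "bar", 2.0)]
)

result = sum_then_average(rows, span=60)
print(result[(0, "foo")], result[(0, "bar")])  # 17.5 1.5
```

Averaging the raw rows directly would instead give foo roughly 3.9% for this bucket, which is the original problem.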

dinisco
Explorer

That did it, so simple. Thank you kindly, much appreciated.
