Fix CPU by process search

Explorer

The *nix app has a cpu by process search that doesn't work under certain conditions:

index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND | timechart avg(pctCPU) by COMMAND

The problem is that if a single event contains multiple processes with the same command name, avg(pctCPU) averages them instead of adding them up. So 5 foo processes, each consuming 3% CPU, return foo=3% when the real total is 15%.

I worked around this by combining COMMAND with PID to make each series unique:

index="os" sourcetype="ps" host=$host$ | multikv fields pctCPU, COMMAND, PID | strcat COMMAND "_" PID cmd | where pctCPU>0 | timechart avg(pctCPU) by cmd limit=0

But this is messy on systems with 50+ processes sharing the same COMMAND name, and Firefox doesn't seem to like limit=0.

Ideally I could sum pctCPU within each event for all processes with the same COMMAND name. That would give a single line on the chart for foo showing 15%, instead of 5 lines that each show foo_$pid at 3%. Is this possible?


Splunk Employee

You're right. This might do it:

index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND | stats sum(pctCPU) as pctCPU by _time,COMMAND | timechart avg(pctCPU) by COMMAND

That is, sum the CPU for each command at each measurement (all rows that share the same _time) before timechart buckets and averages them.
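To make the arithmetic concrete, here is a rough Python sketch (not SPL) of the difference between averaging directly and summing per _time first. The sample rows, the two snapshot times, and the function names are all made up for illustration, and it assumes every row falls into a single timechart bucket.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, roughly what multikv might emit: (_time, COMMAND, pctCPU).
# Two ps snapshots, each with five "foo" processes at 3% and one "bar" at 10%.
rows = [(t, "foo", 3.0) for t in (0, 30) for _ in range(5)]
rows += [(t, "bar", 10.0) for t in (0, 30)]


def avg_per_command(rows):
    """Average pctCPU per command, like avg(pctCPU) by COMMAND in one bucket."""
    by_cmd = defaultdict(list)
    for _time, cmd, pct in rows:
        by_cmd[cmd].append(pct)
    return {cmd: mean(vals) for cmd, vals in by_cmd.items()}


def sum_then_avg(rows):
    """Sum pctCPU per (_time, command) first, then average those totals."""
    totals = defaultdict(float)
    for _time, cmd, pct in rows:
        totals[(_time, cmd)] += pct
    summed = [(t, cmd, total) for (t, cmd), total in totals.items()]
    return avg_per_command(summed)


naive = avg_per_command(rows)   # foo averages to 3.0
fixed = sum_then_avg(rows)      # foo averages to 15.0
print(naive)
print(fixed)
```

Commands with only one process per snapshot, like bar here, come out the same either way, so the extra stats step only changes the series that were being under-reported.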



Explorer

That did it, so simple. Thank you kindly, much appreciated.
