I am trying to get CPU usage for a specific process on Windows. My search looks like this:
host=host1 AND sourcetype="Perfmon:Process" AND counter="% Processor Time" AND process_name="server*" | table _time, counter, process_name, Value
My results show mostly 100 for Value, which does not match what I see on the machine. Windows is running on a VM.
Result looks like this:
2017-09-22T14:40:28.000-0400 % Processor Time server 100
2017-09-22T14:39:43.000-0400 % Processor Time server 100
2017-09-22T14:37:28.000-0400 % Processor Time server 100
2017-09-22T14:32:58.000-0400 % Processor Time server#1 100
2017-09-22T14:32:13.000-0400 % Processor Time server 100
2017-09-22T14:38:13.000-0400 % Processor Time server 100
2017-09-22T14:31:28.000-0400 % Processor Time server#1 11.30968265
2017-09-22T14:21:43.000-0400 % Processor Time server 100
2017-09-22T14:18:43.000-0400 % Processor Time server#1 0.105369743
2017-09-22T14:36:43.000-0400 % Processor Time server 0.034732856
2017-09-22T14:35:58.000-0400 % Processor Time server#1 0.14049302
2017-09-22T14:29:13.000-0400 % Processor Time server 100
2017-09-22T14:28:28.000-0400 % Processor Time server#1 84.84122861
2017-09-22T14:20:58.000-0400 % Processor Time server#1 100
2017-09-22T14:16:28.000-0400 % Processor Time server 100
2017-09-22T14:14:58.000-0400 % Processor Time server#1 100
What should I do? Why is it pulling all the 100s? 80% of events show a 100. Is it an agent config issue?
Perfmon is not very intuitive when watching processor utilization per process. I'm going to assume you're watching process utilization on a multi-processor machine. You may want to use 'System: %Total Processor Time' instead.
Here's some documentation:
https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/monitor-cpu-usage
If you are monitoring a multi-processor machine and using this counter to monitor CPU utilization, then you're only watching a single thread. Generally speaking, 1 thread = 1 core of the machine, so on a 16-core machine, 1 core at 100% is quite normal. If you had 10 instances of 'server#1' running at 100%, you would likely see performance degradation.
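To put the 16-core example in numbers, here is a minimal Python sketch. It assumes, as described above, that the reported value is measured against a single core; the function name `machine_share` is made up for illustration and is not part of Splunk or Perfmon.

```python
def machine_share(per_core_percent: float, cores: int) -> float:
    """Convert a percentage measured against one core into a share of the whole machine.

    Assumption: the counter value is relative to a single core.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return per_core_percent / cores


# One core fully busy on a 16-core machine
print(machine_share(100.0, 16))  # 6.25
```

Under that assumption, a process reporting 100 on a 16-core box is using about 6% of the machine's total CPU capacity.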
Thanks @tmarlette, I was having the same issue and the %Total Processor Time counter helped!
If System: %Total Processor Time is not showing up in Splunk, do I need to ask my admins to add this counter?
Yeah, you would need to ask your Splunk admins to add that counter in the Splunk_TA_windows app.
Hi, CPU usage (or utilization) has to be measured over time, so what happens if you try something like this:
host=host1 AND sourcetype="Perfmon:Process" AND counter="% Processor Time" AND process_name="server*" | timechart avg(Value)
OR
host=host1 AND sourcetype="Perfmon:Process" AND counter="% Processor Time" AND process_name="server*" | timechart avg(Value) by process_name
Note that field names are case-sensitive, so use Value as it appears in your table. If this works and you just need other fields, you can eval them AFTER the timechart command or use a simple appendcols in this case.
If you do a timechart, how are all those false 100s going to affect your average? My concern is not building the chart but the underlying data. Based on my observations, usage never reached 100%, but Splunk keeps bringing it in.
I tested the timechart, by the way, and it stays above 50% on average when in fact it should be around 10%.
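To show how much the 100s pull the average up, here is a rough Python sketch using only the 16 sample values posted in the question. Dropping the 100s is purely for illustration: nothing in the data says which of those readings are false, so this is not a recommended way to clean the data.

```python
from statistics import mean

# The 16 Value readings from the sample output in the question, in posted order
values = [
    100, 100, 100, 100, 100, 100, 11.30968265, 100,
    0.105369743, 0.034732856, 0.14049302, 100,
    84.84122861, 100, 100, 100,
]

at_100 = [v for v in values if v == 100]
others = [v for v in values if v != 100]

share_at_100 = len(at_100) / len(values)  # fraction of readings equal to 100
overall_avg = mean(values)                # what avg(Value) sees for this sample
avg_without_100 = mean(others)            # illustration only, not a real fix

print(f"readings at 100: {len(at_100)} of {len(values)} ({share_at_100:.1%})")
print(f"average of all readings: {overall_avg:.2f}")
print(f"average excluding 100s: {avg_without_100:.2f}")
```

For this sample, 11 of the 16 readings are 100, so any average over the raw events is dominated by them, which is consistent with the timechart staying well above the expected level.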