
splunk-perfmon.exe errors: "Counter is not found"

Path Finder

After updating the Universal Forwarder to 7.3.1 on Windows 10 Pro (64-bit) version 1809, I noticed about 2735 lines like the following in the forwarder's splunkd.log file, all logged around the same time each day. I am not sure whether the forwarder update or a Windows update is responsible. Does anyone have an idea how to fix this?

08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: IO Data Bytes/sec
08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: IO Other Bytes/sec
08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: % Processor Time
08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: % User Time
08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: % Privileged Time
08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: Page Faults/sec
... and more lines ...

Communicator

Windows servers with the configuration documented in Splunk App for Infrastructure 1.4.1.
Both gather metrics correctly, albeit without a prefix, and each generates hundreds of thousands of these error messages every day.


I know this is a bit old, but is there any update on this? I'm seeing the same thing in our environment across multiple versions of Windows (2008 R2, 2012, 2016, 2019). I verified that the inputs.conf stanzas match the output of typeperf -q, and data is collected, but I get TONS of "Counter is not found" errors just like the OP.


Update in case anyone else ends up here: our issue occurred when the Process PerfMon counters were enabled.
I opened a support case and Splunk already had a JIRA for it; the issue is resolved in 8.0.2. So the fix is to update the UFs on Windows to 8.0.2 or later.

Comparing the versions, here is where I see the changes:
in /etc/system/bin/perfmon.cmd:
echo ^
echo ^useWinApiProcStats^
echo ^false^
echo ^false^
echo ^

in inputs.conf.spec:
useWinApiProcStats = <boolean>
* Whether or not the Performance Monitor input uses process kernel mode and
  user mode times to calculate CPU usage for a process, rather than using
  the standard Performance Data Helper (PDH) APIs to calculate those values.
* A problem was found in the PDH APIs that causes Performance Monitor inputs
  to show maximum values of 100% usage for a process on multicore Windows
  machines, even when the process uses more than 1 core at a time.
* When you configure this setting to "true", the input uses the
  GetProcessTime() function in the core Windows API to calculate
  CPU usage for a process, for the following Performance Monitor
  counters, only:
  ** Processor Time
  ** User Time
  ** Privileged Time
* This means that, if a process uses 5 of 8 cores on an 8-core machine, that
  the input should return a value of around 500, rather than the incorrect 100.
* When you configure the setting to "false", the input uses the standard
  PDH APIs to calculate CPU usage for a process. On multicore systems, the
  maximum value that PDH APIs return is 100, regardless of the number of
  cores in the machine that the process uses.
* Performance monitor inputs use the PDH APIs for all other Performance
  Monitor counters. Configuring this setting has no effect on those counters.
* NOTE: If the Windows machine uses a non-English system locale, and you
  have set 'useWinApiProcStats' to "true" for a Performance Monitor input,
  then you must also set 'useEnglishOnly' to "true" for that input.
* Default: false
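
To make the "500 rather than 100" arithmetic in that spec text concrete, here is a minimal Python sketch. It is not Splunk's implementation; the function names and the sample numbers are made up for illustration. It only shows how summing kernel-mode and user-mode time over a sampling window can give a value above 100 on a multicore machine, and what a cap at 100 would do to the same sample.

```python
def cpu_percent(kernel_start, user_start, kernel_end, user_end, wall_seconds):
    """CPU usage in percent from process times; may exceed 100 on multicore hosts.

    All arguments are in seconds. The names are hypothetical, not Splunk's.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    busy = (kernel_end - kernel_start) + (user_end - user_start)
    return 100.0 * busy / wall_seconds


def capped_percent(raw_percent):
    """Mimic the 100% ceiling that the spec text attributes to the PDH APIs."""
    return min(raw_percent, 100.0)


# Hypothetical sample: in a 10-second window the process accumulated
# 10 s of kernel time and 40 s of user time, i.e. about 5 cores kept busy.
usage = cpu_percent(0.0, 0.0, 10.0, 40.0, 10.0)
print(usage)                  # 500.0
print(capped_percent(usage))  # 100.0
```

With the setting at "false" you would expect to see the capped figure; with "true", something closer to the uncapped one.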


SplunkTrust

Review the inputs.conf file on the UF server. Check all of the 'perfmon://' stanzas and verify the counters listed are valid for the server.

You can get a list of counter names by running typeperf -q on the server. The output will look something like this:

\Processor Information(*)\Interrupts/sec
\Processor Information(*)\% Privileged Time
\Processor Information(*)\% User Time
\Processor Information(*)\% Processor Time

The part between the first '\' and (*) is the object name. The part after the second '\' is the counter name.
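
If you have a lot of counters to check, the split described above can be scripted. This is a rough Python sketch, assuming you have saved the typeperf -q output as text; the regular expression and the function names are my own, and it only handles the simple \Object(instance)\Counter and \Object\Counter forms.

```python
import re
from collections import defaultdict

# Matches paths such as \Processor Information(*)\% User Time.
# The "(instance)" part is optional because some objects have no instances.
COUNTER_PATH = re.compile(
    r"^\\(?P<object>[^\\(]+)(?:\((?P<instance>[^)]*)\))?\\(?P<counter>.+)$"
)


def split_counter_path(line):
    """Return (object_name, counter_name), or None if the line does not match."""
    match = COUNTER_PATH.match(line.strip())
    if match is None:
        return None
    return match.group("object"), match.group("counter")


def counters_by_object(lines):
    """Group counter names under their object name."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = split_counter_path(line)
        if parsed is not None:
            obj, counter = parsed
            grouped[obj].append(counter)
    return dict(grouped)


sample = [
    r"\Processor Information(*)\Interrupts/sec",
    r"\Processor Information(*)\% Privileged Time",
    r"\Processor Information(*)\% User Time",
    r"\Processor Information(*)\% Processor Time",
]
print(counters_by_object(sample))
```

You can then compare each object's list against the counters named in the matching 'perfmon://' stanza.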

---
If this reply helps you, an upvote would be appreciated.

Path Finder

The names seem to be correct, and these errors only happen once per day! All of the inputs are set with an interval of 600, so I would think that if the names were wrong it would complain more often. As an example, the log only shows this for the past few days:

08-16-2019 20:56:03 (Many entries but all in the same second.)
08-17-2019 02:22:33 (Only one entry)
08-18-2019 17:43:02 (Only one entry)
08-19-2019 18:16:01 (Many entries again!)
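
For anyone who wants to produce the same kind of per-second tally, here is a small Python sketch. It assumes the log lines look exactly like the ones quoted at the top of this thread; the regular expression and the function name are my own and not part of any Splunk tooling.

```python
import re
from collections import Counter

# Captures the timestamp (to the second) and the counter name from lines
# shaped like the splunkd.log entries quoted earlier in this thread.
ERROR_LINE = re.compile(
    r"^(?P<stamp>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\.\d{3} \S+ "
    r"ERROR ExecProcessor .*Counter is not found: (?P<counter>.+)$"
)


def tally_errors(lines):
    """Count "Counter is not found" errors per second and per counter name."""
    per_second = Counter()
    per_counter = Counter()
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match is not None:
            per_second[match.group("stamp")] += 1
            per_counter[match.group("counter")] += 1
    return per_second, per_counter


sample = [
    r'08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: IO Data Bytes/sec',
    r'08-16-2019 20:56:04.314 -0700 ERROR ExecProcessor - message from "D:\SplunkUniversalForwarder\bin\splunk-perfmon.exe" splunk-perfmon - OutputHandler::composeOutput: Counter is not found: % User Time',
    "08-16-2019 20:56:05.000 -0700 INFO SomethingElse - unrelated line",
]
per_second, per_counter = tally_errors(sample)
print(dict(per_second))   # {'08-16-2019 20:56:04': 2}
print(dict(per_counter))  # {'IO Data Bytes/sec': 1, '% User Time': 1}
```

To run it against a real log, pass an open file object for splunkd.log instead of the sample list.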

It's almost as if Windows is not giving up the information; I just don't know how that information is obtained.
If I look at the date and time of these errors in Search (looking at an hour of data and selecting one of the "not found" counters), I see a dip in the number of events for that minute ONLY.

Any ideas where to go from there?


SplunkTrust

Sorry, I don't.
