Hello,
I am facing a strange issue with the Splunk Forwarder: on some servers of the same role its CPU usage is 0-3%, while on the others it is around 15%. That may not sound bad at first, but it has already caused us issues during deployment, and such behavior is dangerous for live services if it keeps growing.
It started around 3 weeks ago with version 9.3.0 installed on Windows Server 2019 VMs with 8 CPU cores and 24 GB RAM. I have since updated the Forwarder to 9.3.1, but the behavior is the same.
For example, we have 8 servers with the same setup and the same apps running on them; traffic to them is load balanced and very similar, and the number and size of the log files are also very similar. 5 servers are affected, 3 are not.
All of them have 10 inputs configured: 4 perfmon inputs (CPU, RAM, disk space, and Web Services) and 6 monitor inputs watching around 40 log files. Each input looks roughly like one of the two shapes sketched below.
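(The stanza names, counters, and paths here are illustrative placeholders, not our exact settings.)

[perfmon://CPU]
object = Processor
counters = % Processor Time
instances = _Total
interval = 60

[monitor://D:\logs\app\*.log]
sourcetype = app_log
index = app_logs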
Any suggestion on what to check to understand what is happening?
Start in the DMC (Monitoring Console) to compare CPU performance across the various instances, or try this search:
index=_introspection host=<replace-with-hostname> sourcetype=splunk_resource_usage component=PerProcess "data.pct_cpu"="*"
| rename data.* as *
| eval processes=process_type.":".process.":".args
| timechart span=10s max(pct_cpu) as pct_cpu by processes
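In the output, check which process is actually burning the CPU on an affected host (the main splunkd process vs. search or other helper processes) and run the same chart against one of the unaffected servers for comparison.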
This assumes a heavy forwarder (HF); you didn't specify, but if it's a universal forwarder (UF) there is something similar, just a bit different.
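For the UF case, one sketch (assuming the UF forwards its internal logs to your indexers, which is the default) is to look at per-processor CPU in metrics.log instead:

index=_internal host=<replace-with-hostname> source=*metrics.log group=pipeline
| timechart span=60s sum(cpu_seconds) by processor

metrics.log is sampled about every 30 seconds by default, so a wider span than 10s makes sense here.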