*nix app and CPU Load/Usage for Linux

asarolkar
Builder

I need to create a dashboard that captures performance data for Linux instances.

A Universal Forwarder (UF) with the *nix app configured is installed on each Linux instance and pushes data to a central Windows indexer.




For one of those Linux hosts, AMERICA-3, I took the standard query under Splunk > *nix > CPU > CPU By Host > Load Factor and modified it as follows:

index=os sourcetype=vmstat host=* | multikv fields loadAvg1mi | search host="AMERICA-3" | timechart span=5m avg(loadAvg1mi) by host | sort _time

However, this gives me 1-minute load average values in excess of 1.000 and even 2.000 (loadAvg1mi is usually between 0.000 and 1.000). I am not sure what loadAvg1mi measures in the context of multiple CPUs, and I need to understand that better.






What I am trying to build is the Linux equivalent of the following CPU load metric for Windows machines (also provided out of the box by Splunk):

sourcetype="Perfmon:CPU Load" | timechart avg(Value) by host | summaryindex spool=t uselb=t addtime=t index="summary" file="winCPULoad_556591856.stash_new" name="winCPULoad" marker=""




Based on my research, there are these other two queries that I could also try:

index=os sourcetype=vmstat host=* | multikv fields memUsedPct | search host="AMERICA-3" | timechart span=5m max(memUsedPct) by host | sort _time

index=os sourcetype=top host=* | multikv fields pctCPU COMMAND | search host="AMERICA-3" COMMAND="java" | timechart span=5m max(pctCPU) by host | sort _time

(I only care about the java process.)

I am not sure whether either query gives an accurate figure.

1 Solution

dwaddle
SplunkTrust

Linux load average as reported by vmstat is not bounded to between 0 and 1 per CPU. A somewhat old (but also very good) discussion is at http://www.linuxjournal.com/article/9001?page=0,1. The load average measurement, while useful, is not equivalent to the Windows Perfmon "CPU Load" counters.

You may find the *nix app's "percent load by host" search to be a more comparable one:

index=os sourcetype=cpu host=*  | multikv fields pctIdle  | eval Percent_CPU_Load = 100 - pctIdle  | timechart avg(Percent_CPU_Load) by host
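
To make the difference between the two metrics concrete, here is a minimal Python sketch (not part of the *nix app, and only an illustration). It assumes the common rule of thumb that a load average roughly equal to the number of CPUs means the machine is fully busy, so dividing by the CPU count gives a per-CPU figure; the function names `load_per_core` and `percent_cpu_load` are hypothetical. The second function is the same arithmetic as the `eval` in the search above.

```python
import os


def load_per_core(load_avg_1m: float, cpu_count: int) -> float:
    """Divide a 1-minute load average by the CPU count (rule-of-thumb normalization)."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg_1m / cpu_count


def percent_cpu_load(pct_idle: float) -> float:
    """Same arithmetic as the search above: Percent_CPU_Load = 100 - pctIdle."""
    return 100.0 - pct_idle


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    one_minute, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    print(f"1-minute load average: {one_minute:.2f} on {cores} CPU(s)")
    print(f"load per CPU: {load_per_core(one_minute, cores):.2f}")
```

For example, a load average of 2.0 on a 4-CPU host works out to 0.5 per CPU, which is why raw values above 1.000 or 2.000 are not necessarily a sign of trouble on a multi-CPU machine.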

sonicZ
Contributor

Hi Dwaddle, do you know of a similar search to gather just the "load average" you get from top?
