*nix app and CPU Load/Usage for Linux

asarolkar
Builder

I need to create a dashboard which involves capturing Performance Data for Linux instances.

A Universal Forwarder (UF) with the *nix app configured is installed on each Linux instance and forwards data to a central Windows indexer.




For one of those Linux hosts, AMERICA-3, I took the standard query under Splunk > *nix > CPU > CPU By Host > Load Factor and modified it to the following:

index=os sourcetype=vmstat host=* | multikv fields loadAvg1mi | search host="AMERICA-3" | timechart span=5m avg(loadAvg1mi) by host | sort _time

However, this gives me 1-minute load average values in excess of 1.000 and even 2.000 (loadAvg1mi is usually between 0.000 and 1.000). I am not sure what loadAvg1mi measures on a host with multiple CPUs, and I need to understand that better.






What I am trying to build is the Linux equivalent of the following CPU load metric for Windows machines (also provided out of the box by Splunk):

sourcetype="Perfmon:CPU Load" | timechart avg(Value) by host | summaryindex spool=t uselb=t addtime=t index="summary" file="winCPULoad_556591856.stash_new" name="winCPULoad" marker=""




Based on my research, there are these other two queries that I could also try:

index=os sourcetype=vmstat host=* | multikv fields memUsedPct | search host="AMERICA-3" | timechart span=5m max(memUsedPct) by host | sort _time

index=os sourcetype=top host=* | multikv fields pctCPU COMMAND | search host="AMERICA-3" COMMAND="java" | timechart span=5m max(pctCPU) by host | sort _time

For the second query, I only care about the java process. I am not sure whether either query gives an accurate figure.

1 Solution

dwaddle
SplunkTrust

Linux load average as reported by vmstat is not bounded to between 0 and 1 per CPU. A somewhat old (but also very good) discussion is at http://www.linuxjournal.com/article/9001?page=0,1. The load average measurement, while useful, is not equivalent to the Windows Perfmon "CPU Load" counters.

You may find the *nix app's "percent load by host" search to be a closer match:

index=os sourcetype=cpu host=*  | multikv fields pctIdle  | eval Percent_CPU_Load = 100 - pctIdle  | timechart avg(Percent_CPU_Load) by host
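
To illustrate the distinction drawn above, here is a minimal Python sketch, not part of the *nix app. It assumes the usual reading of load average as a count of runnable or waiting tasks, which is why a value of 2.0 can be unremarkable on a multi-core host. The function names and the sample values (a 4-core host, 85% idle) are hypothetical; `pct_idle` mirrors the `pctIdle` field used in the search.

```python
def load_per_cpu(load_avg: float, cpu_count: int) -> float:
    """Divide a load average by the number of CPUs.

    Load average is not capped at 1.0; dividing by the CPU count
    gives a rough per-CPU figure that is easier to compare across hosts.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def percent_cpu_load(pct_idle: float) -> float:
    """Same arithmetic as the eval in the search: 100 - pctIdle."""
    return 100.0 - pct_idle


# Hypothetical sample values, not taken from AMERICA-3.
print(load_per_cpu(2.0, 4))    # load of 2.0 spread over 4 CPUs
print(percent_cpu_load(85.0))  # 85% idle means 15% busy
```

The per-CPU figure is only a rough indicator of queueing pressure, while `100 - pctIdle` is a utilization percentage, which is the closer analogue to the Windows counter.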

sonicZ
Contributor

Hi Dwaddle, do you know of a similar search to gather just the "load average" you get from top?
