CPU Load on AIX - getting different values from TA-nmon (events), TA-metricator-for-nmon (metrics) for the same host

alexeyglukhov
Path Finder

Hello all,

I have two Nmon TAs running on one AIX host.
TA-nmon (event-based) shows a higher CPU load than TA-metricator-for-nmon (metrics-based).

I need help figuring out what might cause this.

SPL on data from TA-nmon (events):

| tstats 
avg(CPU.UPTIME.load_average_1min) AS load_average_1min 
avg(CPU.UPTIME.load_average_5min) AS load_average_5min 
avg(CPU.UPTIME.load_average_15min) AS load_average_15min 
from datamodel=NMON_Data_CPU 
where (nodename = CPU.UPTIME) host= 
groupby _time host prestats=true `nmon_span`
| timechart `nmon_span` 
avg(CPU.UPTIME.load_average_1min) AS load_average_1min 
avg(CPU.UPTIME.load_average_5min) AS load_average_5min 
avg(CPU.UPTIME.load_average_15min) AS load_average_15min

SPL on data from TA-metricator-for-nmon (metrics):

| mstats avg(_value) as value where `nmon_metrics_index` 
(metric_name=os.unix.nmon.system.uptime.load_average_1min OR 
metric_name=os.unix.nmon.system.uptime.load_average_5min OR 
metric_name=os.unix.nmon.system.uptime.load_average_15min) 
host= by metric_name `nmon_span` 
| `extract_metrics("load_average_1min load_average_5min load_average_15min")`
| fillnull value=0 load_average_1min load_average_5min load_average_15min 
| timechart `nmon_span` 
avg(load_average_1min) as load_average_1min 
avg(load_average_5min) as load_average_5min 
avg(load_average_15min) as load_average_15min

Update:
Continuing my investigation.
The source is the output of the "uptime" command, which is exactly the same for both the events and metrics TAs.

From
/opt/splunkforwarder/etc/apps/(TA-metricator-for-nmon OR TA-nmon)/bin/nmon_external_cmd/nmon_external_snap.sh

# Uptime information (uptime command output)
echo "UPTIME,$1,\"`uptime | sed 's/^\s//g' | sed 's/,/;/g'`\"" >>NMON_FIFO_PATH/nmon_external.dat &
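
To make the collection step easier to follow, here is a rough Python sketch of what the two sed calls do to the uptime output, and how the three load averages can be read back from the result. The sample uptime line and both function names are made up for illustration; this is not code from either TA.

```python
import re


def to_nmon_uptime_field(raw: str) -> str:
    """Mimic the two sed calls: drop one leading whitespace character,
    then replace every comma with a semicolon."""
    return re.sub(r"^\s", "", raw).replace(",", ";")


def parse_load_averages(field: str) -> tuple:
    """Pull the 1, 5 and 15 minute load averages out of the rewritten line."""
    match = re.search(
        r"load average:\s*([\d.]+);\s*([\d.]+);\s*([\d.]+)", field
    )
    if match is None:
        raise ValueError("no load averages found in: " + field)
    return tuple(float(value) for value in match.groups())


# Assumed sample line; real uptime output differs between systems.
sample = " 02:15PM   up 12 days,  3:04,  2 users,  load average: 1.20, 1.15, 1.10"

field = to_nmon_uptime_field(sample)
one_min, five_min, fifteen_min = parse_load_averages(field)
print(field)
print(one_min, five_min, fifteen_min)
```

Since both TAs write the same line, any difference between events and metrics has to come from a later step, such as the search itself.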

It turns out that the event-based data is accurate, but the metric values are exactly one third of the event values (see attached screenshot).
1 Solution

Melstrathdee
Path Finder

Hey Alexey,
Remove line 7 from your metrics SPL:

fillnull value=0 load_average_1min load_average_5min load_average_15min

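Because mstats is split by metric_name, each time span comes back as 3 rows, and each row only has a value for one of the three load_average fields. The fillnull sets the other two fields to 0 in every row, and the timechart then averages across all 3 rows (as two of them have a 0 value, it is essentially dividing by 3). Without the fillnull, avg() simply ignores the empty fields.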
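
The arithmetic can be checked outside Splunk. Below is a minimal Python sketch, assuming one time span contains three rows and each row holds only its own metric field. The row values and function names are invented for illustration; it only models the averaging, not the actual macros.

```python
from statistics import mean

# Hypothetical rows for a single time span: one row per metric_name,
# each carrying only its own field (values are made up).
rows = [
    {"load_average_1min": 3.0},
    {"load_average_5min": 2.4},
    {"load_average_15min": 1.8},
]


def avg_ignoring_missing(rows, field):
    """Average only over rows where the field exists (missing values are skipped)."""
    return mean(row[field] for row in rows if field in row)


def avg_after_fillnull(rows, field):
    """Average after replacing every missing value with 0, as fillnull value=0 would."""
    return mean(row.get(field, 0) for row in rows)


print(avg_ignoring_missing(rows, "load_average_1min"))  # 3.0
print(avg_after_fillnull(rows, "load_average_1min"))    # 1.0, one third of the real value
```

The second result is the real value divided by the number of rows, which matches the "exactly 3 times less" seen in the metrics panel.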

alexeyglukhov
Path Finder

Indeed, this fixed the problem. Thank you very much, Mel!
So the SPL of the "UPTIME Load Average" panel in Metricator for NMON needs a slight correction. Working version:

| mstats avg(_value) as value where `nmon_metrics_index` 
 (metric_name=os.unix.nmon.system.uptime.load_average_1min OR 
 metric_name=os.unix.nmon.system.uptime.load_average_5min OR 
 metric_name=os.unix.nmon.system.uptime.load_average_15min) 
 host= by metric_name `nmon_span` 
 | `extract_metrics("load_average_1min load_average_5min load_average_15min")`
 | timechart `nmon_span` 
 avg(load_average_1min) as load_average_1min 
 avg(load_average_5min) as load_average_5min 
 avg(load_average_15min) as load_average_15min

to4kawa
Ultra Champion

SPL on data from TA-nmon (events):

Add span to the tstats search.

alexeyglukhov
Path Finder

Hi @to4kawa,
Thanks for the suggestion. My aim is to migrate from event-based dashboards to metrics-based ones, which is why I am currently checking whether the metrics provide exactly the same data as the events.
