Solved: Determine which CPU has a higher load than average

jhuysing · 06-05-2024

I can create a query and produce a time chart so I can see the load across the set of CPUs:

| timechart values(VALUE) span=15m by cpu limit=0

I can see a trend that one CPU has a higher load.

I can also create a query using stats to get the max, mean, and range of the load value:

| stats max(VALUE) as MaxV, mean(VALUE) as MeanV, range(VALUE) as Delta by _time

What I want to do is identify any CPU that's running a higher load than the average plus some sort of fiddle factor.

ITWhisperer · 06-05-2024

It is not clear what your actual requirement is. Which average do you want to compare to? The average VALUE for that time period (15m) across all cpus, or the average for that cpu across the whole time period?

Assuming the former, a "standard" way of deriving a "fiddle factor" is to compute the standard deviation of the VALUEs in each 15m time period and then, for each cpu, how many standard deviations its VALUE lies above the mean. You might do it like this:

| eventstats mean(VALUE) as MeanV stdev(VALUE) as StDevV by _time
| eval exceedFactor=if(VALUE > MeanV,(VALUE - MeanV)/StDevV, 0)
| timechart values(exceedFactor) span=15m by cpu limit=0
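
To get a list of the offending CPUs rather than a chart, one option is to filter on the score. This is only a sketch under a few assumptions: `<your_search>` stands for the base search, events are first reduced to one average per cpu per 15-minute bucket, `cpuLoad` is a made-up field name, and the cutoff of 1 is an arbitrary placeholder to tune. Since stdev is the sample standard deviation, the score is bounded when there are few CPUs (with four CPUs it can never exceed 1.5), so a cutoff of 2 or 3 only makes sense for larger sets.

```spl
<your_search>
| bin span=15m _time
| stats avg(VALUE) as cpuLoad by _time cpu
| eventstats mean(cpuLoad) as MeanV stdev(cpuLoad) as StDevV by _time
| eval exceedFactor=if(cpuLoad > MeanV, (cpuLoad - MeanV)/StDevV, 0)
| where exceedFactor > 1
| table _time cpu cpuLoad MeanV StDevV exceedFactor
```

Each remaining row is a cpu whose average load in that bucket sits more than one standard deviation above the mean of all cpus in the same bucket.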

jhuysing · 06-05-2024

No, that's not right.
Putting the cpu into the by clause of the stats command doesn't give the mean value for the cluster.
It's performing the stats on the individual CPUs.

gcusello · 06-05-2024

Hi @jhuysing ,

I don't know which data you are monitoring, but in any case you can add the CPU name to the stats BY clause.

Then you can create your own rule to fire an alert, e.g. max value more than 30% above the average, using a where condition.

In your case (using 30% more than MeanV):

<your_search>
| bin span=15m _time
| stats
     max(VALUE) AS MaxV
     mean(VALUE) AS MeanV
     range(VALUE) AS Delta
     BY _time cpu
| where MaxV>MeanV*1.3

If you use _time in the stats BY clause, remember to add the bin command before it.
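
Note that when the CPU field is in the BY clause, MeanV is each CPU's own mean within the bucket, so the where clause compares a CPU against itself. If the comparison should be against the average across all CPUs in the same 15-minute bucket, a possible variant is sketched below. It assumes the field is named `cpu` as in the original timechart; `CpuMeanV` and `ClusterMeanV` are made-up field names, and the 1.3 factor is simply carried over from the search above.

```spl
<your_search>
| bin span=15m _time
| stats avg(VALUE) AS CpuMeanV BY _time cpu
| eventstats avg(CpuMeanV) AS ClusterMeanV BY _time
| where CpuMeanV > ClusterMeanV*1.3
```

Here ClusterMeanV is the average of the per-CPU averages, which matches the overall mean of VALUE only when every CPU reports the same number of events per bucket.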

Ciao.

Giuseppe
