Deployment Architecture

Help on using bucket _time

jip31
Motivator

Hello,

In the query below, I monitor events where the CPU used is > 80%.
But I want to monitor only events where the CPU used is > 80% over the last 10 minutes.
Could you confirm that bucket _time span=10m does that?

 `CPU` earliest=-30d latest=now
| bucket _time span=10m 
| where process_cpu_used_percent>80 
| timechart avg(process_cpu_used_percent) as process_cpu_used_percent by host  limit=10 useother=false

Thanks


DavidHourani
Super Champion

Hi @jip31,

Using | bucket _time span=10m will group events based on the _time field with a 10-minute span. It adds no logic to process_cpu_used_percent, so it doesn't make sense to use process_cpu_used_percent>80 right after the bucket command.

You could do the following to fix the results:

  `CPU` earliest=-30d latest=now
 | bucket _time span=10m 
 | stats avg(process_cpu_used_percent) as process_cpu_used_percent by host,_time
 | where process_cpu_used_percent>80 
 | timechart avg(process_cpu_used_percent) as process_cpu_used_percent by host  limit=10 useother=false
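To illustrate what this pipeline computes, here is a rough Python sketch of the bucket → stats avg → where steps, using made-up sample events (the field names mirror the SPL above; the data is hypothetical):

```python
from collections import defaultdict

# Hypothetical raw events: (epoch_seconds, host, process_cpu_used_percent)
events = [
    (0,   "hostA", 95), (120, "hostA", 90),  # hostA averages high in slot 0
    (60,  "hostB", 85), (300, "hostB", 10),  # hostB spikes once, but averages low
    (700, "hostA", 50),                      # hostA is quiet in the next slot
]

SPAN = 600  # 10 minutes, like "bucket _time span=10m"

# bucket _time span=10m: round each timestamp down to its 10-minute slot,
# then stats avg(...) by host,_time: average CPU per host per slot
sums = defaultdict(lambda: [0.0, 0])
for ts, host, cpu in events:
    key = (host, ts // SPAN * SPAN)
    sums[key][0] += cpu
    sums[key][1] += 1
averages = {key: total / n for key, (total, n) in sums.items()}

# where process_cpu_used_percent > 80: keep only the busy 10-minute slots
busy = {key: avg for key, avg in averages.items() if avg > 80}
print(busy)  # only ("hostA", 0) survives with an average of 92.5
```

Note how hostB's single 85% spike disappears: averaged over its 10-minute slot it falls below the threshold, which is exactly why filtering after the stats differs from filtering the raw events.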

Let me know if that helps.

Cheers,
David


chinmoya
Communicator

@jip31 :
Using span=10m alone will not do that; bucket only rounds _time, so all events falling in the same 10-minute timeframe get grouped under one time value.

If you want to restrict the search to the last 10 minutes, you can use the query below:

CPU earliest=-10m@m latest=now process_cpu_used_percent>80
| timechart limit=10 useother=false span=10m avg(process_cpu_used_percent) as process_cpu_used_percent by host

span=10m will give you output in 10-minute blocks.
limit=10 will restrict your output to 10 hosts.

I would suggest you remove limit, as it might truncate your results if more than 10 hosts go over 80% in the last 10 minutes.
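As a side note on the earliest=-10m@m modifier: the @m snaps the window's start down to the whole minute. A tiny Python sketch of that behaviour (the helper names here are hypothetical, chosen just for illustration):

```python
def snap_to_minute(epoch):
    """Mimic Splunk's @m snap modifier: round down to the whole minute."""
    return epoch - epoch % 60

def last_10m_window(now):
    """Hypothetical helper mimicking earliest=-10m@m latest=now."""
    return snap_to_minute(now - 600), now

# With a made-up "now" of 1,000,000 epoch seconds (40s past the minute):
earliest, latest = last_10m_window(1_000_000)
print(earliest, latest)  # 999360 1000000
```

So the search window is slightly longer than exactly ten minutes whenever "now" is not on a minute boundary.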

0 Karma


jip31
Motivator

Hi,
There is something wrong...
I don't get any results...


DavidHourani
Super Champion

what does this give you ?

`CPU` earliest=-30d latest=now
  | bucket _time span=10m 
  | stats avg(process_cpu_used_percent) as process_cpu_used_percent by host,_time
  | where process_cpu_used_percent>80 

jip31
Motivator

It gives no results...
I get results if I put the where before the stats.


DavidHourani
Super Champion

Oh... so I'm guessing that in the 10-minute intervals nothing averages to more than 80. Try max instead of avg and see what that gives.


jip31
Motivator

Yes, it works now.
But... I need an average of these events.
Perhaps by using another stats?
Like this, David?

`CPU` earliest=-30d latest=now 
| bucket _time span=10m 
| stats max(process_cpu_used_percent) as process_cpu_used_percent by host,_time 
| where process_cpu_used_percent>80 
| timechart avg(process_cpu_used_percent) as process_cpu_used_percent by host limit=10 useother=false

DavidHourani
Super Champion

Yes, exactly. I'd advise you to use perc90 instead of max for the first stats, since max can simply be a short peak and not a real indicator of sustained high load.

Your whole search would look like this:

 `CPU` earliest=-30d latest=now
   | bucket _time span=10m 
   | stats perc90(process_cpu_used_percent) as process_cpu_used_percent by host,_time
   | where process_cpu_used_percent>80 
   | timechart avg(process_cpu_used_percent) as process_cpu_used_percent by host  limit=10 useother=false
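To see why perc90 is more robust than max here, consider a simplified Python sketch: a single spike drives max over the threshold, while the 90th percentile only crosses it under sustained load. (This uses a simple nearest-rank estimate with made-up data; Splunk's perc90 uses its own interpolation, so exact values may differ.)

```python
def perc90(values):
    """Nearest-rank 90th percentile (simplified, not Splunk's exact method)."""
    ordered = sorted(values)
    rank = max(0, int(0.9 * len(ordered) + 0.5) - 1)
    return ordered[rank]

one_spike = [10, 12, 11, 9, 13, 10, 12, 11, 10, 99]   # one peak, mostly idle
sustained = [85, 88, 90, 87, 92, 86, 89, 91, 88, 90]  # consistently busy

print(max(one_spike), perc90(one_spike))  # max flags the idle host; perc90 doesn't
print(max(sustained), perc90(sustained))  # both flag the genuinely busy host
```

With a >80 threshold, max would report the mostly idle host as overloaded because of its single 99% sample, while perc90 stays well below 80 for it and above 80 for the sustained-load host.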

jip31
Motivator

OK, many thanks!
