Alerting

How to create and trigger an alert if the CPU usage is constantly 100% for the past 10 minutes?

akash5333
New Member

Hello,

We have both Windows and Linux environments. We want to set up an alert that sends an email if the CPU usage of a particular process has stayed at 100% for the past 10 minutes. Below are the searches I have for CPU usage:

Linux:

host=yyyy index=* COMMAND=java USER=xxxxxx | timechart span=10m limit=0 avg(pctCPU) as "% of CPU Usage"

Windows:

index=* host=zzzz sourcetype="Perfmon:CPU" source="Perfmon:CPU" counter="% Processor Time" | timechart span=10m limit=0 avg(Value) as "% of CPU Usage"
1 Solution

JMichaelis
Path Finder

You can use a real-time alert with a rolling window of 10 minutes with the following search:

Linux:

host=yyyy index=* COMMAND=java USER=xxxxxx | stats avg(pctCPU) as CPUUsage | where CPUUsage = 100

Windows:

index=* host=zzzz sourcetype="Perfmon:CPU" source="Perfmon:CPU" counter="% Processor Time" | stats avg(Value) as CPUUsage | where CPUUsage = 100

These searches return a result only when the average is 100, which can only happen if every sample in the window was at 100% (assuming the reported values never exceed 100).
You can then use the "Per-Result" trigger of the real-time alert, which fires whenever the search returns a result.
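
As a possible variation on the Linux search above (a sketch that nobody in this thread has tested), you could check the minimum instead of the average. If the lowest sample in the window is at 100, then every sample was, and the alert no longer depends on an exact equality match against a computed average. The field names come from the original question; the names `MinCPU` and `Samples` and the `>= 100` threshold are assumptions chosen for illustration.

```spl
host=yyyy index=* COMMAND=java USER=xxxxxx
| stats min(pctCPU) as MinCPU, count as Samples
| where Samples > 0 AND MinCPU >= 100
```

The `Samples > 0` check only makes explicit that an empty window should not produce a result. The same pattern should carry over to the Windows search by swapping in its base search and using `min(Value)`.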

frobinson_splun
Splunk Employee

Hi @akash5333,
Try creating a real-time alert with a rolling time window trigger. This lets you monitor for conditions that occur within a particular time window (in this case, CPU usage over a 10-minute span).

See
http://docs.splunk.com/Documentation/Splunk/6.3.3/Alert/Definerolling-windowalerts

Hope this helps!

akash5333
New Member

Hi @frobinson,

Here is the output of my query over a 10-minute span. I have set up a rolling-window alert to send an email if CPU usage is more than 10, but I never received the alert. Please let me know where I am going wrong.

2016-03-04 09:50:00
1.9
13.6
27.3
3.0
54.6

frobinson_splun
Splunk Employee

Hi @akash5333,
What are your trigger conditions? Are you throttling the alert at all?

akash5333
New Member

Hi @frobinson,

Yes, I have set the throttle to 10 seconds. Here is the trigger condition:

Realtime Alert - search pctCPU>10 - in 10 seconds

frobinson_splun
Splunk Employee

Thanks--taking a look and I'll get back to you soon!

frobinson_splun
Splunk Employee

Hi @akash5333,
I'm not sure which query you are using. Is it one of the original queries you posted or one of the suggested queries in this post? I think there may be a couple of problems with the trigger condition. It sounds like your query renames the average CPU percentage, but your trigger condition checks a field from the original event data.

Keep in mind that a custom trigger condition is a secondary search applied to your base query's results. So you might need to double-check the query result fields to make sure you are using the right fields in the trigger condition.
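
To illustrate, assume the base search is the Linux search suggested by @JMichaelis without its final `where` clause (an assumption, since it isn't clear which query is in use). Its only result field is then `CPUUsage`, so a custom trigger condition has to reference that field. A condition on `pctCPU` would find nothing, because that field no longer exists after `stats`. The threshold of 10 is taken from the reply above; treat this pairing as a sketch, not a verified configuration.

Base search:

```spl
host=yyyy index=* COMMAND=java USER=xxxxxx | stats avg(pctCPU) as CPUUsage
```

Custom trigger condition:

```spl
search CPUUsage > 10
```

If the base search is one of the original `timechart` searches instead, the result field would be named "% of CPU Usage", and the condition would need to reference that name.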

Also, I'm not sure that the "pctCPU>10" and "in 10 seconds" parts of the condition match the alert scenario you described at first (CPU usage at 100% for 10 minutes). This might be something to double-check too.

Have you tried the suggested queries from @JMichaelis? They might match the scenario you want more closely.

Hope this helps!
