Alerting on number of times something happens in a timeframe

Michael_Wilde
Splunk Employee

I'm monitoring CPU usage on a Windows server. What's the best way to create a search/alert if CPU usage goes over 80% for 30 minutes (as an example)?

Lowell
Super Champion

You could use the transaction command with eval-based start/end conditions. It seems like there could be a better way to do this (I'm curious too), but this approach does seem to work based on a simple test that I ran. (There could be corner cases, though; I'm not sure.)

sourcetype="WMI:CPUTime" | transaction host startswith=eval(PercentProcessorTime>=80) endswith=eval(PercentProcessorTime<80) | where duration>=1800

One question I have is this: Do you want your results to include a situation where the CPU is, say, running at 90% for 15 minutes, then drops to 70% for less than a minute (just long enough for one WMI snapshot, perhaps due to a blocking condition), and then returns to 90% for another 20 minutes? This would certainly seem to fit the general criteria of what you are trying to find, but it wouldn't technically match the search above, because the single sample below 80% ends the transaction. Some kind of weighted-average approach would probably allow this situation to be captured.
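
Here is a rough sketch of that averaging idea. It assumes the search is saved as a scheduled alert that runs over the last 30 minutes and triggers when it returns any results. It uses a plain average rather than a true weighted one, and the output field names (avg_cpu, min_cpu, samples) are made up for illustration. I haven't run it against real WMI data, so treat it as a starting point:

```
sourcetype="WMI:CPUTime" earliest=-30m
| stats avg(PercentProcessorTime) AS avg_cpu, min(PercentProcessorTime) AS min_cpu, count AS samples BY host
| where avg_cpu>=80
```

A single brief dip to 70% barely moves the 30-minute average, so the host would still show up. The samples count is there so you can ignore hosts that only reported a handful of snapshots in the window. If you want the strict behavior instead (no sample below the threshold at all), change the last line to test min_cpu>=80.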
