Alerting on number of times something happens in a timeframe

Michael_Wilde · 06-28-2010

I'm monitoring CPU usage on a Windows server. What's the best way to create a search/alert that fires if CPU usage stays over 80% for 30 minutes (as an example)?

Lowell · 06-28-2010

You could use the transaction command with eval-based start/end conditions. There may well be a better way to do this (I'm curious too), but this approach does seem to work based on a simple test that I ran. There could be corner cases that I haven't checked.

sourcetype="WMI:CPUTime" | transaction host startswith=eval(PercentProcessorTime>=80) endswith=eval(PercentProcessorTime<80) | where duration>=1800

One question I have is this: do you want your results to include a situation where the CPU runs at, say, 90% for 15 minutes, drops to 70% for less than a minute (just long enough for one WMI snapshot, perhaps due to a blocking condition), and then returns to 90% for another 20 minutes? That would certainly seem to fit the general criteria of what you are trying to find, but it wouldn't technically match the search above, because the single sample below 80% ends the transaction. Some kind of weighted-average approach would probably allow this situation to be captured.
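
To make the averaging idea concrete, here is a rough sketch of the logic in Python rather than SPL. It uses a plain (unweighted) trailing 30-minute mean, not a true weighted average, and it assumes one sample per minute. The function name rolling_avg_alerts and the sample data are made up for illustration; this is not a Splunk API and not something I've run against real WMI data.

```python
from collections import deque


def rolling_avg_alerts(samples, threshold=80.0, window_s=1800):
    """Return timestamps where the mean CPU over the trailing window_s
    seconds is at or above threshold.

    samples: iterable of (epoch_seconds, cpu_percent), sorted by time.
    """
    window = deque()  # samples currently inside the trailing window
    total = 0.0       # running sum of cpu_percent for those samples
    alerts = []
    first_ts = None
    for ts, cpu in samples:
        if first_ts is None:
            first_ts = ts
        window.append((ts, cpu))
        total += cpu
        # Drop samples that have aged out of the trailing window.
        while window[0][0] <= ts - window_s:
            _, old_cpu = window.popleft()
            total -= old_cpu
        # Only evaluate once a full window of history is available.
        if ts - first_ts >= window_s and total / len(window) >= threshold:
            alerts.append(ts)
    return alerts


# Hypothetical data matching the scenario above, one sample per minute:
# 15 minutes at 90%, a single 70% snapshot, then 20 minutes at 90%.
samples = (
    [(t, 90.0) for t in range(0, 900, 60)]
    + [(900, 70.0)]
    + [(t, 90.0) for t in range(960, 2160, 60)]
)
alerts = rolling_avg_alerts(samples)
```

With this data the single 70% snapshot only pulls the 30-minute mean down to roughly 89%, so the condition still fires once a full window has accumulated. The trade-off is that an average can also be satisfied by a short spike to 100% offset by readings well under 80%, so it answers a slightly different question than "continuously over 80%".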
