Getting Data In

Create an alert based on CPU being at X% for a span of X minutes

rahulkumarfgf
Explorer

Hi Everyone!
I have researched this issue and found a few solutions, though not completely. I followed this link:
https://answers.splunk.com/answers/557838/create-an-alert-based-on-cpu-being-at-95-for-a-spa.html

and wanted to know if I can use "%_Processor_Time" instead of CPUPct as I am not able to extract "CPUPct" field.

Also, I followed this link: https://answers.splunk.com/answers/693250/how-do-i-alert-if-cpu-is-greater-than-97-for-more.html

here, I wanted to understand what does "instance=Total" mean?

Also, which one of the accepted answers is better to use? The queries I used are as follows:
SPL 1:

index="perfmoncpu" | bin _time span=1m

| stats max(%_Processor_Time) as PercentProcessorTime by host _time
| eval PercentProcessorTime = round(PercentProcessorTime, 2)
| eval overload = if(PercentProcessorTime >= 90, 1, 0)
|streamstats current=f last(overload) as prevload by host
|eval newgroup=case(isnull(prevload),1, prevload!=overload,1, true(),0)
|streamstats sum(newgroup) as groupno by host
|eventstats count as LoadDuration by host groupno
| where overload = 1 and LoadDuration >= 10
| table host _time PercentProcessorTime LoadDuration

SPL 2:

index="perfmoncpu" source="PerfmonMk:CPU" instance=_Total
| sort 0 _time
| streamstats time_window=15min avg(cpu_load_percent) as last15min_load count by host
| eval last15min_load = if (count < 90,null,round(last15min_load, 2))
| where (last15min_load) >= 90
| table host, cpu_load_percent, last15min_load

I have used count<90 as the above SPL generates a count of 90 mins throughout

Please let me know if you guys have any further questions.

Thank You!

PS: I am a newbie trying to learn splunk!

0 Karma

to4kawa
Ultra Champion

ans1:
instance=_Total is instance field has _Total string(value).

ans2:
%_Processor_Time can be used by field name.

ans3:
try both and check job inspector.

If you provide sample logs, we can make query.

0 Karma

rahulkumarfgf
Explorer

Hi @to4kawa ,
Thank you for your response. I am using both, however, am not sure what exactly to check in job inspector that will give me the idea that the SPL is correct.

Regarding logs, am trying to find a way to submit them. I will try and add a link to it.

Thank you!

0 Karma

to4kawa
Ultra Champion

Shorter elapsed times are better queries.

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...