How to write a search to report the top VMs out of almost 7000 with an increasing CPU usage trend?

dkoops
Path Finder

I'm looking for a way to determine the slope of CPU usage over a set timeframe for a large number of VMs. I can calculate it for one VM, but the environment currently contains almost 7000 VMs.

I would like a report showing the top VMs with an increasing CPU usage trend.


dkoops
Path Finder

Thanks for the reply. Unfortunately, the suggested search doesn't work.
If I use
| timechart span=1d avg(Value) as yvalue by VMName
the column names become the individual VM names instead of 'yvalue', so the eval functions that follow no longer work.


lguinn2
Legend

Would this be better?

index=vcenter_script host=vcenter_statistics Type=VM MetricId=cpu.usage.average
| timechart span=1d avg(Value) as yvalue by VMName
| eventstats count as numevents sum(_time) as sumX sum(yvalue) as sumY sum(eval(_time*yvalue)) as sumXY 
                      sum(eval(_time*_time)) as sumX2 sum(eval(yvalue*yvalue)) as sumY2 by VMName
| eval slope=((numevents*sumXY)-(sumX*sumY))/((numevents*sumX2)-(sumX*sumX))
| eval yintercept= (sumY-(slope*sumX))/numevents
| eval newY=(yintercept + (slope*_time))
| delta newY p=1 as Slp
| stats avg(Slp) as avgSlope by VMName
| sort -avgSlope
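
For anyone who wants to sanity-check the arithmetic outside Splunk, here is a minimal Python sketch of the same per-VM least-squares slope built from running sums. The function names and the (VM name, epoch seconds, daily average) row layout are assumptions for illustration only; they are not part of the search above. Because _time is in epoch seconds, the raw slope is per second, so the sketch multiplies by 86400 to express it per day.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


def slopes_by_vm(rows):
    """Return {vm_name: slope per day} from (vm_name, epoch_seconds, value) rows.

    Hypothetical helper: it mirrors the sumX / sumY / sumXY / sumX2 fields
    of the search, accumulated per VM in a single pass over the data.
    """
    # Per VM: [n, sumX, sumY, sumXY, sumX2]
    sums = defaultdict(lambda: [0, 0, 0, 0, 0])
    for vm, x, y in rows:
        s = sums[vm]
        s[0] += 1
        s[1] += x
        s[2] += y
        s[3] += x * y
        s[4] += x * x

    slopes = {}
    for vm, (n, sx, sy, sxy, sx2) in sums.items():
        denom = n * sx2 - sx * sx
        if denom == 0:
            # Fewer than two distinct timestamps: no slope can be fitted.
            continue
        slopes[vm] = (n * sxy - sx * sy) / denom * SECONDS_PER_DAY
    return slopes


def top_increasing(rows, limit=10):
    """Return up to `limit` (vm_name, slope) pairs with positive slope, steepest first."""
    ranked = sorted(slopes_by_vm(rows).items(), key=lambda kv: kv[1], reverse=True)
    return [(vm, slope) for vm, slope in ranked if slope > 0][:limit]
```

The point of the sketch is that one pass with grouped sums is enough, so nothing has to be repeated per VM.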

martin_mueller
SplunkTrust

You'd probably want to replace the timechart with this:

...
| bucket span=1d _time
| stats avg(Value) as yvalue by _time VMName
...

Then the following calculations should work; add a timechart at the end if required.

Note that this ignores days with zero events for a VMName instead of counting them as zero, which is incorrect if missing days should count as zero usage. If that's relevant for your data, then there's a bit more work to be done.
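
To illustrate how much the missing-day caveat can matter, here is a small Python sketch comparing the fitted slope when a missing day is skipped against the slope when it is filled with zero. The helper names and the day-index layout are hypothetical, and filling with zero is only one possible choice; whether it is the right one depends on what a day without events means in your data.

```python
def slope(pairs):
    """Least-squares slope of y over x for a list of (x, y) pairs, or None if undefined."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs)
    sx2 = sum(x * x for x, _ in pairs)
    denom = n * sx2 - sx * sx
    if denom == 0:
        return None
    return (n * sxy - sx * sy) / denom


def fill_missing_days(points, start_day, end_day, fill=0.0):
    """Return (day, value) for every day in the range, using `fill` where no value exists."""
    return [(day, points.get(day, fill)) for day in range(start_day, end_day + 1)]


# Day 2 has no events for this VM.
observed = {0: 10.0, 1: 12.0, 3: 16.0}

slope_skipping = slope(sorted(observed.items()))         # missing day ignored
slope_filled = slope(fill_missing_days(observed, 0, 3))  # missing day counted as 0
```

With this toy data the two slopes differ noticeably, so it is worth deciding up front which behaviour you want before ranking VMs.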


dkoops
Path Finder

Well, for one VM I use the macro "lineartrend(2)" to get a trendline and then use the delta command to get the slope of the trendline. (The macro already generates a 'slope' field, but that's not the value I want.)

Currently I do have a way to calculate it for all VMs: the map command repeats the macro for every VM, but as you can imagine it's really inefficient. My current query takes about 16 hours to complete, but I'm sure someone knows a more efficient way.

Query I use:
host=vcenter_platform Type=VM
| dedup VMName | table VMName
| map maxsearches=9999 search="search index=vcenter_script host=vcenter_statistics Type=VM VMName=$VMName$ MetricId=cpu.usage.average
| timechart span=1d avg(Value) as yvalue
| `lineartrend(_time,yvalue)`
| delta newY p=1 as Slp
| stats avg(Slp) as $VMName$"
| addtotals col=t row=f labelfield=total
| search total=Total
| fields - total
| transpose


martin_mueller
SplunkTrust

Do post how you calculate that for one VM.
