Alerting

How to check if load is equally distributed on the host and create an alert?

Vicky84
Explorer

Hi,
We generally raise tickets in Prod through Splunk by saving a search query as a Report/Alert. We now have a requirement to alert if the load is not equally distributed between the hosts. With the top command I can see the result as a percentage, but I wasn't able to use it in a where clause to calculate the deviation.

Say we have 4 hosts sharing an app. Ideally the load should be distributed almost equally, but if the load in Prod is noticeably lower or higher on one of the hosts, I should get an alert.

Log example: index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"


somesoni2
Revered Legend

Give this a try. I've used 1 percent as the threshold difference between a host's percent and the average percent (100/total hosts).

index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| top host showperc=t showcount=f
| eventstats count
| eval average=100/count
| where percent<average-1 OR percent>average+1
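
The same check can also be sketched with stats instead of top. By default top returns only the 10 most common values, so this form may be easier to reuse when more hosts are involved. The field names total, hosts, percent, and average are illustrative, and the 1-point threshold is carried over from the query above.

```spl
index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| stats count by host
| eventstats sum(count) as total, dc(host) as hosts
| eval percent=round(100*count/total, 2), average=100/hosts
| where abs(percent-average) > 1
```

Saved as an alert that triggers when the number of results is greater than zero, this would list only the hosts whose share is off by more than the threshold. A host that logs no events at all in the search window does not appear in the results, so neither form of the query would flag it.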


asplunk789
Loves-to-Learn Everything

Is it possible to do the same comparison across hundreds of servers of different types (app servers, DB servers, etc.)?
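
One possible way to extend the check to many servers of different kinds is to compare each host only against peers of the same type. The sketch below assumes a field named role (hypothetical; it could come from a lookup or an extracted field) that labels each host as an app server, DB server, and so on. The base search is copied from the original question and would need to be widened to cover the servers of interest. With hundreds of hosts each host's expected share drops to about 1 percent or less, so a fixed 1-point threshold stops being meaningful. The sketch therefore uses a relative threshold of 20 percent of the average, which is also an assumption.

```spl
index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| stats count by role, host
| eventstats sum(count) as total, dc(host) as hosts by role
| eval percent=round(100*count/total, 2), average=100/hosts
| where abs(percent-average) > 0.2*average
```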


mattymo
Splunk Employee

Hi Vicky84,

I would recommend collecting host metrics with something like collectd, the nix_ta, or nmon rather than top, so you can get the CPU trend over time. Then you could compare the trends and calculate a deviation.

- MattyMo
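
The trend comparison suggested above depends on which metrics source is installed, so no metric or field names from those add-ons are shown here. As a rough sketch of the same idea using the event data from the original question, the search below buckets events into 15-minute spans and flags any host whose event count in a span differs from that span's cross-host average by more than 20 percent. The span, the 20 percent threshold, and the field names avg_count and deviation_pct are assumptions chosen for illustration.

```spl
index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| bin _time span=15m
| stats count by _time, host
| eventstats avg(count) as avg_count by _time
| eval deviation_pct=round(100*(count-avg_count)/avg_count, 1)
| where abs(deviation_pct) > 20
```

A relative threshold is used here rather than a multiple of the standard deviation because, with only four hosts, a single value cannot lie more than 1.5 sample standard deviations from the mean, so a rule such as "more than 2 standard deviations" would never fire.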

Vicky84
Explorer

In a larger context, what you are suggesting may make more sense for monitoring OS stats, but I am not well versed in that. A Splunk query like the one posted in this thread would do the task for me.



Vicky84
Explorer

Exactly what I wanted!
