Splunk Search

How to measure by-host "imbalance"?

stucky101
Engager

Gurus

I have an Infoblox search that simply measures the total number of queries by host over a certain period for a given Infoblox cluster. The counts are usually pretty uneven (e.g. 25% : 75%).

I can use that in a pie chart easily. However, I'm also interested in measuring the "imbalance factor" so that I can rank clusters from most to least imbalanced. I have no clue where to start, since I'd need two values to do math with, and "count" isn't even a field.

Is this possible?

Thx

1 Solution

ITWhisperer
SplunkTrust

Try something like this

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total,2)


ITWhisperer
SplunkTrust

What information do you have? Can you share some sample events (preferably in a code block </>)?


stucky101
Engager

Hi ITWhisperer

The query itself doesn't matter at all; any "stats count by host" will do. Here's an example.

I have pairs of Infoblox nodes in a global grid, and I want to see how many queries each node served over the last hour per regional cluster.

{dns query} host="ibcluster01-*" | stats count by host

This spits out two values and displays them per host. For easy math, say the two counters are 25 and 75. Quite imbalanced.

Now I run it against another cluster

{dns query} host="ibcluster02-*" | stats count by host 

This time the 2 values are 45 and 55, quite balanced.

I wanted to measure the imbalance factor for each cluster so I can rank them. I was thinking that measuring the gap between the two would do the trick.

Cluster 01: 75 - 25 = imbalance of 50

Cluster 02: 55 - 45 = imbalance of 10 (winner)

I could assign that number to a field called "imbalance" and then rank clusters by most or least imbalanced with something like "top 5 imbalance". That's the theory 🙂

 

But I can't help feeling there is already something fancy in SPL for this, yet I cannot find anything. Even if I did the math myself, wouldn't I need the two counters as fields so I can run further evals on them?
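As a side note on that last question: stats can emit both counters as fields in a single pass, e.g. the smallest and largest per-host count. A minimal sketch, assuming hostnames begin with the cluster name followed by a hyphen (the field names "low" and "high" are illustrative):

```
| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats min(load) as low max(load) as high by cluster
| eval imbalance=high-low
| sort - imbalance
```

After the second stats, "low" and "high" are ordinary fields on each cluster row, so further eval math on them works.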

I hope this clarifies my question.


ITWhisperer
SplunkTrust

Assuming, as per your example, host names begin with the cluster name followed by a hyphen, you could do something like this

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats range(load) as balance by cluster

stucky101
Engager

I'm not home right now, so I'm doing this from my phone. The actual hostnames are a bit more complex, but this might still work. The names use the following standard:

 

site-ibsn-cxnx.domainname

E.g., a site called "sjca" has these two hosts:

 

sjca-ibsn-c01n01.domain.com

sjca-ibsn-c01n02.domain.com

 

Then at site "laxa":

laxa-ibsn-c01n01.domain.com

laxa-ibsn-c01n02.domain.com

 

etc. I'll take a look at your suggestion when I can. Thank you for your help so far!
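Given that naming standard, splitting on "-" and taking the first element should still yield the site as the cluster key. A quick runanywhere check, with a few hostnames hard-coded purely for illustration:

```
| makeresults
| eval host=split("sjca-ibsn-c01n01.domain.com,sjca-ibsn-c01n02.domain.com,laxa-ibsn-c01n01.domain.com",",")
| mvexpand host
| eval cluster=mvindex(split(host,"-"),0)
| table host cluster
```

The cluster column should come out as sjca, sjca, laxa.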


stucky101
Engager

So I tried this and yes, the range calculates well. However, I realized that this produces a value that is specific to each cluster. It cannot be compared to the ranges of other clusters, since they see a completely different number of hits within the same timeframe. E.g.:

A cluster that is split 9:1 has a range of 8 (very imbalanced), while a cluster that is split 60:40 has a range of 20. That makes the latter look more imbalanced when the opposite is true. I tried to apply some simple math afterwards, but it appears "load" loses its value right after the range calculation for some reason.

stats range(load) as balance by cluster | eval result=(balance/load*100) | table cluster,result

This produces nothing for "result". Yet when I replace "load" with an arbitrary number, it produces a value:

stats range(load) as balance by cluster | eval result=(balance/30*100) | table cluster,result

 

In fact, I can see that "load" loses its value right after doing the range calc on it.

stats range(load) as balance by cluster | table cluster,load

 

I presume the stats range function does that, but how do I get these ranges turned into percentages for cross-cluster comparison?

 

Thx


ITWhisperer
SplunkTrust

Try something like this

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total,2)
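To see what that pipeline produces, here is a runanywhere version that substitutes the two example clusters (25/75 and 45/55) for the first stats. The inline-CSV form of makeresults requires Splunk 8.2 or later, and the hostnames are made up:

```
| makeresults format=csv data="host,load
ibcluster01-n01,25
ibcluster01-n02,75
ibcluster02-n01,45
ibcluster02-n02,55"
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total,2)
```

This should give ibcluster01 an imbalance of 50 and ibcluster02 an imbalance of 10, matching the examples above.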

stucky101
Engager

You da man! This deserves an upvote! Thank you very much! You saved my day.

I do have a question. I had almost figured out something similar and it looked great with ONE major problem.

I had this:

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats first(load) as load range(load) as range by cluster
| eval result=round((range/load)*100)
| table cluster,result

It worked properly on all but a single cluster. On that one I got percentage values over 100 :(.

I know the problem with stats is that it only returns the fields it has created. I saw a post where someone suggested the trick of using "first" to preserve the load value. I tried that, but it turns out it only works when the cluster is imbalanced one way (n01 has more load than n02). I have one cluster where that is reversed, and there the stats produce a reversed percentage. Would there have been a way to make my query work for both cases as well? Just curious.


ITWhisperer
SplunkTrust

Instead of first(load), you could have used max(load)


stucky101
Engager

Hmmm, so these two queries do not produce the same results. Similar enough to get fooled initially, but not the same. When I check the top 10 with this:

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats max(load) as load range(load) as range by cluster
| eval imbalance=round((range/load)*100)
| table cluster,imbalance
| sort imbalance desc
| head 10

I get:

lon03 100
gga 98
sms 97
tyoa 93
au01 92
rb01 90
swldmz 89
par 86
bp01 85
sf01 84

When I use this query:

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total)
| table cluster,imbalance
| sort imbalance desc
| head 10

I get:

lon03 100
gga 96
sms 94
tyoa 86
au01 85
rb01 82
swldmz 80
par 76
bp01 74
sf01 72

 

Some with a 10-point difference.

The percentage math is the same and we're both rounding. I wonder whether these two queries can be considered equal. If so, shouldn't they produce the same (or nearly the same) numbers over the same timespan?

thx


ITWhisperer
SplunkTrust

They are not the same calculation - one calculates the imbalance based on the difference in load as a percentage of the maximum load, whereas the other calculates the imbalance based on the difference in load as a percentage of the total load. Both can be considered as measures of imbalance but are equal at only two points, complete balance and complete imbalance.

Have a look at this runanywhere example to see the difference between the two ways of calculating the imbalance.

| makeresults count=51
| fields - _time
| streamstats count as row
| eval loadA=row - 1
| eval loadB=100 - loadA
| eval range=abs(loadA - loadB)
| eval imbalanceA=round(100*range/max(loadA, loadB),2)
| eval imbalanceB=round(100*range/(loadA + loadB),2)
| table loadA loadB range imbalanceA imbalanceB

So, it comes down to what is it you are trying to show as your measure of imbalance.
