Splunk Search

How to measure by-host "imbalance"?

stucky101
Engager

Gurus

I have an Infoblox search that simply measures the total number of queries by host over a certain period for a given Infoblox cluster. The counts are usually pretty uneven (e.g. 25% : 75%).

I can use that in a pie chart easily. However, I'm also interested in measuring the "imbalance factor" so that I can rank clusters from most to least imbalanced. I have no clue where to start, since I'd need two values to do math with, and "count" isn't even a field.

Is this possible?

Thx

1 Solution

ITWhisperer
SplunkTrust

Try something like this

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total,2)


ITWhisperer
SplunkTrust

What information do you have? Can you share some sample events (preferably in a code block </>)?


stucky101
Engager

Hi ITWhisperer

The query itself doesn't matter at all; any "stats count by host" will do. Here's an example.

I have pairs of Infoblox nodes in a global grid, and I want to see how many queries each node served over the last hour per regional cluster.

{dns query} host="ibcluster01-*" | stats count by host

This spits out two values and displays them per host. For easy math, say the two counters are 25 and 75. Quite imbalanced.

Now I run it against another cluster

{dns query} host="ibcluster02-*" | stats count by host 

This time the 2 values are 45 and 55, quite balanced.

I wanted to measure the imbalance factor for each cluster so I can rank them. I was thinking that measuring the gap between the two would do the trick.

Cluster 01: 75 - 25 = imbalance of 50

Cluster 02: 55 - 45 = imbalance of 10 (winner)

I could assign that number to a field called "imbalance" and then rank clusters by most or least imbalanced with something like "top 5 imbalance". That's the theory 🙂

 

But I can't help feeling there is already something fancy in SPL for this, yet I cannot find anything. Even if I did the math myself, wouldn't I need the two counters as fields so I can run further evals on them?
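As a side note on that last question: stats can emit both counters as fields in a single pass, e.g. the smallest and largest per-host count. A minimal sketch, assuming hostnames begin with the cluster name followed by a hyphen (the field names "low" and "high" are illustrative):

```
| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats min(load) as low max(load) as high by cluster
| eval imbalance=high-low
| sort - imbalance
```

After the second stats, "low" and "high" are ordinary fields on each cluster row, so further eval math on them works.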

I hope this clarifies my question.


ITWhisperer
SplunkTrust

Assuming, as per your example, host names begin with the cluster name followed by a hyphen, you could do something like this

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats range(load) as balance by cluster

stucky101
Engager

I'm not home right now, so I'm doing this from my phone. The actual hostnames are a bit more complex, but this might still work. The names use the following standard:

 

site-ibsn-cxnx.domainname

E.g., a site called "sjca" has these two hosts:

 

sjca-ibsn-c01n01.domain.com

sjca-ibsn-c01n02.domain.com

 

Then at site "laxa":

laxa-ibsn-c01n01.domain.com

laxa-ibsn-c01n02.domain.com

 

etc. I'll take a look at your suggestion when I can. Thank you for your help so far!
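Given that naming standard, splitting on "-" and taking the first element should still yield the site as the cluster key. A quick runanywhere check, with a few hostnames hard-coded purely for illustration:

```
| makeresults
| eval host=split("sjca-ibsn-c01n01.domain.com,sjca-ibsn-c01n02.domain.com,laxa-ibsn-c01n01.domain.com",",")
| mvexpand host
| eval cluster=mvindex(split(host,"-"),0)
| table host cluster
```

The cluster column should come out as sjca, sjca, laxa.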


stucky101
Engager

So I tried this and yes, the range calculates well. However, I realized that this produces a value that is specific to each cluster. It cannot be compared to the ranges of other clusters, since they see a completely different number of hits within the same timeframe. E.g.:

A cluster that is split 9:1 has a range of 8 (very imbalanced), while a cluster that is split 60:40 has a range of 20. That makes the latter look more imbalanced when the opposite is true. I tried to apply some simple math afterwards, but it appears "load" loses its value right after the range calculation for some reason.

stats range(load) as balance by cluster | eval result=(balance/load*100) | table cluster,result

This produces nothing for "result". Yet when I replace "load" with an arbitrary number, it produces a value:

stats range(load) as balance by cluster | eval result=(balance/30*100) | table cluster,result

 

In fact, I can see that "load" loses its value right after doing the range calc on it.

stats range(load) as balance by cluster | table cluster,load

 

I presume the stats range function does that, but how do I get these ranges turned into percentages for cross-cluster comparison?

 

Thx


ITWhisperer
SplunkTrust

Try something like this

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total,2)
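To see what that pipeline produces, here is a runanywhere version that substitutes the two example clusters (25/75 and 45/55) for the first stats. The inline-CSV form of makeresults requires Splunk 8.2 or later, and the hostnames are made up:

```
| makeresults format=csv data="host,load
ibcluster01-n01,25
ibcluster01-n02,75
ibcluster02-n01,45
ibcluster02-n02,55"
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total,2)
```

This should give ibcluster01 an imbalance of 50 and ibcluster02 an imbalance of 10, matching the examples above.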

stucky101
Engager

You da man! This deserves an upvote! Thank you very much! You saved my day.

I do have a question. I had almost figured out something similar and it looked great with ONE major problem.

I had this:

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats first(load) as load range(load) as range by cluster
| eval result=round((range/load)*100)
| table cluster,result

It worked properly on all but a single cluster. On that one I got percentage values over 100 :(.

I know the problem with stats is that it only returns the fields it has created. I saw a post where someone suggested the trick of using "first" to preserve the load value. I tried that, but it turns out it only works when the cluster is imbalanced one way (n01 has more load than n02). I have one cluster where that is reversed, and there the stats produce a reversed percentage. Would there have been a way to make my query work for both cases as well? Just curious.


ITWhisperer
SplunkTrust

Instead of first(load), you could have used max(load)


stucky101
Engager

Hmmm, so these two queries do not produce the same results. Similar enough to get fooled initially, but not the same. When I check the top 10 with this:

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| stats max(load) as load range(load) as range by cluster
| eval imbalance=round((range/load)*100)
| table cluster,imbalance
| sort imbalance desc
| head 10

I get:

lon03 100
gga 98
sms 97
tyoa 93
au01 92
rb01 90
swldmz 89
par 86
bp01 85
sf01 84

When I use this query:

| stats count as load by host
| eval cluster=mvindex(split(host,"-"),0)
| eventstats sum(load) as total by cluster
| stats range(load) as imbalance values(total) as total by cluster
| eval imbalance=round(100*imbalance/total)
| table cluster,imbalance
| sort imbalance desc
| head 10

I get:

lon03 100
gga 96
sms 94
tyoa 86
au01 85
rb01 82
swldmz 80
par 76
bp01 74
sf01 72

 

Some with a 10-point difference.

The percentage math is the same and we're both rounding. I wonder whether these two queries can be considered equal. If so, shouldn't they produce the same (or nearly the same) numbers over the same timespan?

thx


ITWhisperer
SplunkTrust

They are not the same calculation - one calculates the imbalance based on the difference in load as a percentage of the maximum load, whereas the other calculates the imbalance based on the difference in load as a percentage of the total load. Both can be considered as measures of imbalance but are equal at only two points, complete balance and complete imbalance.

Have a look at this runanywhere example to see the difference between the two ways of calculating the imbalance.

| makeresults count=51
| fields - _time
| streamstats count as row
| eval loadA=row - 1
| eval loadB=100 - loadA
| eval range=abs(loadA - loadB)
| eval imbalanceA=round(100*range/max(loadA, loadB),2)
| eval imbalanceB=round(100*range/(loadA + loadB),2)
| table loadA loadB range imbalanceA imbalanceB

So, it comes down to what is it you are trying to show as your measure of imbalance.
