Dashboards & Visualizations

Compare today vs average of past with a sliding percentage scale

dstuder
Communicator

I have a dashboard that shows a bunch of different metrics for some data that I have. One of the metrics compares todays counts vs an average of the last four weeks on the same weekday and up to the same time of the day. So in essence my data looks kind of like this.

 

type     avg_past         today
----------------------------------
foo      10456            10550
bar      6                9
baz      20               30
etc...

 

I've got this charting to a bar graph where I can see for each type the past average vs today. What I would like to do is only show the ones where there is a statistically significant difference between the past and today. I could throw on something like this  to my search ...

 

| where today > (avg_past * 1.25)

 

And that will work ok for types that have lots of data, but for instance in my example above "bar" has 50% more data. So, it would also show up, but it's really not statistically significant. So, my predicament is that I need the percentage different to be larger the smaller the counts and smaller as the counts go up. Thoughts on how to achieve this? Thanks.

Labels (1)
1 Solution

PickleRick
SplunkTrust
SplunkTrust

OK. Let's approach it from the maths side.

Your condition

today>avg*1.25

can be expressed as

today/avg>1.25

(Yes, I see the avg in denominator but I don't care for now)

And you want this ratio not to be fixed but to be more flexible proportionally to the avg value.

So you can adjust it to be

today/avg>1.25*f(avg)

Now you only need the f(avg) to be "properly" changing the value so that it is relatively big for small avg values and trends towards 1 for bigger values. For example

f(avg)=1+1/avg

If your average is only limited to natural numbers you'll only get as high as 2, so you might scale it out a bit. If your average can go all the way to 0, you could adjust it a bit because if avg trends towards 0, 1/avg goes up to infinity so that's probably not what we want.

So you might fiddle with offsetting the arguments here and there and you'll end up with something like

| where today>1.25*avg*(avg+a)/(avg+b)+c

Now you'll only need to find a, b and c to suit your needs.

Draw some examples on https://www.desmos.com/calculator and find your parameters 🙂

View solution in original post

PickleRick
SplunkTrust
SplunkTrust

Well, we can't tell you what you want from your data 😉

But you could - for example - add a "safeguard" for the absolute values so that you don't mind a relatively "big change" as long as the values are low

| where today > (avg_past * 1.25) AND today > 100

That's one example.

Otherwise you could simply "adjust" the factor by the ratio of today (or avg) vs. some predefined threshold effectively making the comparison non-linear. Just watch out for the upper range because it could get squeezed so much that you'd get "alerted" on any change.

It's really just a question what _you_ want.

 

0 Karma

dstuder
Communicator

"Well, we can't tell you what you want from your data"

What did I miss in my example that would help with understanding my question?


"It's really just a question what _you_ want."

Specifically, I'm wanting to find out how to change the percentage based on past average counts. As it is in my example I'm saying show me everything where today is 25% larger than the past average. But for small amounts that's not helpful. It's really a math thing. I could probably do an "if/then" on the average past count and use a different percentage for different ranges, but that seems inelegant to me. I'm sure there is a mathematical way to slide the percentage based on counts ... but that math is a bit beyond me.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

OK. Let's approach it from the maths side.

Your condition

today>avg*1.25

can be expressed as

today/avg>1.25

(Yes, I see the avg in denominator but I don't care for now)

And you want this ratio not to be fixed but to be more flexible proportionally to the avg value.

So you can adjust it to be

today/avg>1.25*f(avg)

Now you only need the f(avg) to be "properly" changing the value so that it is relatively big for small avg values and trends towards 1 for bigger values. For example

f(avg)=1+1/avg

If your average is only limited to natural numbers you'll only get as high as 2, so you might scale it out a bit. If your average can go all the way to 0, you could adjust it a bit because if avg trends towards 0, 1/avg goes up to infinity so that's probably not what we want.

So you might fiddle with offsetting the arguments here and there and you'll end up with something like

| where today>1.25*avg*(avg+a)/(avg+b)+c

Now you'll only need to find a, b and c to suit your needs.

Draw some examples on https://www.desmos.com/calculator and find your parameters 🙂

dstuder
Communicator

I followed your post until it went from 

1+1/avg

to

(avg+a)/(avg+b)+c

I'm not totally sure what a, b, and c represent, but I ended up going with this after looking at the graph and feeling like the curve worked pretty good with what I was trying to accomplish.

(avg+20)/(avg+1)+1

Anyway, thanks a ton. This is exactly what I was hoping to accomplish. It seems to be working pretty well. I tip my hat to your math skills. There was a day when I could have done that but if you don't use it you lose it.

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...