Splunk Search

Is anyone aware of the availability of the geometric mean stats in Splunk?

hvandenb
Path Finder

Is anyone aware of the availability of the geometric mean stats in Splunk?

1 Solution

aljohnson_splun
Splunk Employee

According to the documentation, Splunk does not currently ship a built-in geometric mean function.

However, to get the geometric mean of our_field, one could do:

... | eval natural_logs = ln(our_field)
| stats mean(natural_logs) as log_mean
| eval geometric_mean = exp(log_mean)

explanation:

If we think for a moment about what the geometric mean really is, that being the nth root of the product of n numbers:
$$\left(\prod_{i=1}^{n} x_i\right)^{1/n} = \sqrt[n]{x_1 x_2 \cdots x_n}$$

we could express this in terms of logarithms, since multiplication becomes a sum and the power becomes multiplication:
$$\left(\prod_{i=1}^{n} x_i\right)^{1/n} = \exp\left[\frac{1}{n}\sum_{i=1}^{n} \ln x_i\right]$$
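As a quick sanity check of the equivalence between the nth-root form and the log form, here is a worked example with two illustrative values, 2 and 8 (chosen only because the arithmetic is easy to follow):

```latex
\sqrt{2 \cdot 8} = \sqrt{16} = 4
\qquad\text{and}\qquad
\exp\!\left[\frac{\ln 2 + \ln 8}{2}\right]
  = \exp\!\left[\frac{\ln 16}{2}\right]
  = \exp(\ln 4)
  = 4
```

Both routes give 4, as expected.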

Wikipedia:

The right-hand side formula above is generally the preferred alternative for implementation in computer languages. This is because calculating the product of many numbers can lead to an arithmetic overflow or arithmetic underflow. This is less likely to occur when you first take the logarithm of each number and sum these.
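To make the overflow and underflow point concrete, here is a minimal Python sketch (not Splunk code) comparing the two routes. The function names and the sample values are made up for illustration, and the inputs are assumed to be strictly positive.

```python
import math

def naive_geometric_mean(values):
    """nth root of the running product; the product can overflow or underflow."""
    product = 1.0
    for v in values:
        product *= v
    return product ** (1.0 / len(values))

def log_geometric_mean(values):
    """exp of the mean of the natural logs; intermediate values stay small."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Illustrative inputs only: the true geometric means are 1e200 and 1e-200.
big = [1e200] * 10      # the product would be 1e2000, beyond float range
small = [1e-200] * 10   # the product would be 1e-2000, which underflows

naive_big = naive_geometric_mean(big)       # product overflows to inf
naive_small = naive_geometric_mean(small)   # product underflows to 0.0
log_big = log_geometric_mean(big)           # stays close to 1e200
log_small = log_geometric_mean(small)       # stays close to 1e-200
```

With these inputs the naive version returns inf and 0.0, while the log-based version recovers values close to 1e200 and 1e-200.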

So in Splunk, working backwards, we can:

1.) Take the natural log of each value with the eval function ln()
2.) Take the mean of those logs with stats mean()
3.) Apply the exponential function to that mean with the eval function exp()
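The three steps above can be mirrored outside Splunk to check the result. This is a rough Python sketch, assuming strictly positive inputs. The variable names echo the field names in the search, while `geometric_mean_via_logs` and the sample values are hypothetical. It also compares against `statistics.geometric_mean` from the Python standard library (3.8+).

```python
import math
import statistics

def geometric_mean_via_logs(values):
    # Step 1: natural log of each value (mirrors the eval ln() step).
    natural_logs = [math.log(v) for v in values]
    # Step 2: arithmetic mean of the logs (mirrors the stats mean step).
    log_mean = statistics.fmean(natural_logs)
    # Step 3: exponentiate the mean (mirrors the eval exp() step).
    return math.exp(log_mean)

# Hypothetical sample: the product is 64, and its cube root is 4.
sample = [2.0, 8.0, 4.0]
result = geometric_mean_via_logs(sample)
reference = statistics.geometric_mean(sample)
```

Both `result` and `reference` should agree to within floating-point rounding.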


aljohnson_splun
Splunk Employee

A second approach would be to use the R app for Splunk.

1.) Download the app
2.) Add the path to your R bin in $SPLUNK_HOME/etc/apps/r/default/r.conf e.g. r=/usr/bin/R
3.) Pipe to R in your search command like this:

| table some_field
| r "exp(mean(log(data.matrix(input)))) -> output"

Here is a slightly more complicated example:

sourcetype=ps earliest=-4m
| multikv fields RSZ_KB
| search RSZ_KB > 0 AND VSZ_KB > 0
| table RSZ_KB VSZ_KB
| r "
gm_mean = function(x, na.rm=TRUE){
  exp(sum(log(x[x > 0]), na.rm=na.rm) / length(x))
}
data <- data.matrix(input);
output <- apply(data, 2, gm_mean)"

provides

x
132.902175678696
34188.4285350717
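One detail of the gm_mean function worth noting: it only sums the logs of the positive values, but still divides by the full length of the column. The Python port below is my reading of that R code, not something verified against R, and the sample values are made up. In the search above this makes no difference, because non-positive rows are already filtered out before the data reaches R.

```python
import math

def gm_mean(values):
    """Python port of the R gm_mean above (assumed equivalent)."""
    # Only positive values contribute to the sum of logs; None stands in for NA.
    positive_logs = [math.log(v) for v in values if v is not None and v > 0]
    # The denominator is the full length, so skipped entries act like a value of 1.
    return math.exp(sum(positive_logs) / len(values))

all_positive = gm_mean([2.0, 8.0])        # ordinary geometric mean
with_zero = gm_mean([4.0, 0.0, 4.0])      # the zero still counts in the denominator
```

With the zero included, the result is 4^(2/3), roughly 2.52, rather than 4, so it is safer to filter non-positive values before calling it, as the search does.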

hvandenb
Path Finder

Thanks for this as well. We'll be using R for more and more stats capabilities, so thanks for pointing this out.


aljohnson_splun
Splunk Employee

Please correct me if I'm wrong, too. I was just trying to brainstorm a way...


hvandenb
Path Finder

That's really good. It would be good to have this implemented; the geometric mean is useful when comparing ratios. Thanks for the help, I'll submit a request.
