Splunk Enterprise

Why does stats median does not work with 0 value?

segantinro
Engager

I need to personalize the "Data Processing Queues" monitored made by Monitoring Console.

I found that "median" aggregate function, on stats or timechart commands does not work correctly.

Indeed, launching the following search, over "all time" on  my PC (host=localhost), I obtain that median is 0 if on values there is a 0.

In the example attached, the correct median is 0.73, instead Splunk calculate 0.

 

(group=queue host=localhost index=_internal name=* source=*metrics.log sourcetype=splunkd)
| eval ingest_pipe=if(isnotnull(ingest_pipe),ingest_pipe,"none")
| search ingest_pipe=*
| where match(name,"agg")
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size), curr=if(isnotnull(current_size_kb),current_size_kb,current_size), fill_perc=round(((curr / max) * 100),2)
| timechart minspan=30s Median(fill_perc) values(fill_perc) avg(fill_perc) useother=false limit=15

 

 

median.png

 

Anyone else found this issue ?

 

Labels (3)
Tags (3)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

It is quite possible that this is correct, given that your avg is 0.11 and you have quite a few values above that so there must be quite a few values below that to drive the mean down to 0.11

Try listing all the values to see if the median is right

| timechart minspan=30s Median(fill_perc) list(fill_perc) avg(fill_perc) useother=false limit=15

You could also try counting them

View solution in original post

segantinro
Engager

For median calculation, I considered only disinct values and not all values!

This was a wrong way to calculate

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

It is quite possible that this is correct, given that your avg is 0.11 and you have quite a few values above that so there must be quite a few values below that to drive the mean down to 0.11

Try listing all the values to see if the median is right

| timechart minspan=30s Median(fill_perc) list(fill_perc) avg(fill_perc) useother=false limit=15

You could also try counting them

johnhuang
Motivator

I concur with @ITWhisperer. Use list and you'll see all the zeros in your data set. 

0 Karma

FelixLeh
Contributor

your previous search until time chart command
| timechart minspan=30s values(fill_perc) as values_fill_perc avg(fill_perc)

| eventstats median(values_fill_perc) by _time
| rename values_fill_perc as "values(fill_perc)"

The eventstats command uses the multivalue field creates by the values() aggregate function and adds a new column to the table.

Warning: This will give you the Median of existent values and not the actual median over all events in the database.  (see comment from @ITWhisperer )

_______________________________________

If this was helpful please consider awarding Karma. Thx!

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...