Splunk Enterprise

Why does stats median not work with 0 values?

segantinro
Engager

I need to customize the "Data Processing Queues" monitoring provided by the Monitoring Console.

I found that the "median" aggregate function, in the stats and timechart commands, does not seem to work correctly.

When I run the following search over "All time" on my PC (host=localhost), the median comes out as 0 whenever the values include a 0.

In the attached example, the correct median is 0.73, but Splunk calculates 0.

 

(group=queue host=localhost index=_internal name=* source=*metrics.log sourcetype=splunkd)
| eval ingest_pipe=if(isnotnull(ingest_pipe),ingest_pipe,"none")
| search ingest_pipe=*
| where match(name,"agg")
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size), curr=if(isnotnull(current_size_kb),current_size_kb,current_size), fill_perc=round(((curr / max) * 100),2)
| timechart minspan=30s Median(fill_perc) values(fill_perc) avg(fill_perc) useother=false limit=15

 

 

[Attached screenshot: median.png]

 

Has anyone else run into this issue?

 


ITWhisperer
SplunkTrust

It is quite possible that this is correct. Your avg is 0.11 and you have quite a few values above that, so there must be many values below it (such as 0) to drive the mean down to 0.11.

Try listing all the values to see whether the median is right:

| timechart minspan=30s Median(fill_perc) list(fill_perc) avg(fill_perc) useother=false limit=15

You could also try counting them.
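
To make the reasoning above concrete, here is a minimal sketch in Python rather than SPL, so it can be run outside Splunk. The sample numbers are hypothetical and are not taken from the original poster's data; the helper name summarize is also made up for illustration. It shows how a handful of non-zero values can sit on top of many zeros, giving a small but non-zero mean and a median of exactly 0.

```python
from statistics import mean, median

# Hypothetical fill percentages for one time bucket (NOT the poster's data).
fill_perc = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.12, 0.73, 1.35]


def summarize(samples):
    """Return the count, number of zeros, rounded mean, and median."""
    return {
        "count": len(samples),
        "zeros": sum(1 for v in samples if v == 0),
        "mean": round(mean(samples), 2),
        "median": median(samples),
    }


summary = summarize(fill_perc)
print(summary)
# {'count': 9, 'zeros': 6, 'mean': 0.24, 'median': 0.0}
```

Six of the nine samples are 0, so the middle value of the sorted list is 0 even though the mean is pulled above 0 by the three larger values. Counting the events per bucket in Splunk should reveal the same pattern.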


segantinro
Engager

For the median calculation, I considered only the distinct values and not all of the values!

That was the wrong way to calculate it.
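
The difference between the two calculations can be sketched as follows, again in Python with hypothetical numbers (not the original data). Deduplicating with set() here stands in for what values() shows in Splunk, which returns only distinct values, while the full list corresponds to what list() shows.

```python
from statistics import median

# Hypothetical fill percentages for one time bucket (NOT the poster's data).
all_values = [0.0, 0.0, 0.12, 0.0, 0.73, 0.0, 1.35, 0.0, 2.5]

# Deduplicate and sort: this is the view you get when looking only at
# distinct values.
distinct_values = sorted(set(all_values))

median_all = median(all_values)            # middle of all 9 samples
median_distinct = median(distinct_values)  # middle of the 5 distinct values

print(median_all, median_distinct)
# 0.0 0.73
```

The repeated zeros collapse into a single entry once the values are deduplicated, which shifts the middle value upward. The median over all events is the one that the median() aggregate reports.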


johnhuang
Motivator

I concur with @ITWhisperer. Use list and you'll see all the zeros in your data set. 


FelixLeh
Contributor

<your previous search up to the timechart command>
| timechart minspan=30s values(fill_perc) as values_fill_perc avg(fill_perc)
| eventstats median(values_fill_perc) by _time
| rename values_fill_perc as "values(fill_perc)"

The eventstats command uses the multivalue field created by the values() aggregate function and adds a new column to the table.

Warning: This gives you the median of the distinct values that occur, not the actual median over all events. (See the comment from @ITWhisperer.)
