Splunk Enterprise

Why does stats median does not work with 0 value?

segantinro
Engager

I need to personalize the "Data Processing Queues" monitored made by Monitoring Console.

I found that "median" aggregate function, on stats or timechart commands does not work correctly.

Indeed, launching the following search, over "all time" on  my PC (host=localhost), I obtain that median is 0 if on values there is a 0.

In the example attached, the correct median is 0.73, instead Splunk calculate 0.

 

(group=queue host=localhost index=_internal name=* source=*metrics.log sourcetype=splunkd)
| eval ingest_pipe=if(isnotnull(ingest_pipe),ingest_pipe,"none")
| search ingest_pipe=*
| where match(name,"agg")
| eval max=if(isnotnull(max_size_kb),max_size_kb,max_size), curr=if(isnotnull(current_size_kb),current_size_kb,current_size), fill_perc=round(((curr / max) * 100),2)
| timechart minspan=30s Median(fill_perc) values(fill_perc) avg(fill_perc) useother=false limit=15

 

 

median.png

 

Anyone else found this issue ?

 

Labels (3)
Tags (3)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

It is quite possible that this is correct, given that your avg is 0.11 and you have quite a few values above that so there must be quite a few values below that to drive the mean down to 0.11

Try listing all the values to see if the median is right

| timechart minspan=30s Median(fill_perc) list(fill_perc) avg(fill_perc) useother=false limit=15

You could also try counting them

View solution in original post

segantinro
Engager

For median calculation, I considered only disinct values and not all values!

This was a wrong way to calculate

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

It is quite possible that this is correct, given that your avg is 0.11 and you have quite a few values above that so there must be quite a few values below that to drive the mean down to 0.11

Try listing all the values to see if the median is right

| timechart minspan=30s Median(fill_perc) list(fill_perc) avg(fill_perc) useother=false limit=15

You could also try counting them

johnhuang
Motivator

I concur with @ITWhisperer. Use list and you'll see all the zeros in your data set. 

0 Karma

FelixLeh
Communicator

your previous search until time chart command
| timechart minspan=30s values(fill_perc) as values_fill_perc avg(fill_perc)

| eventstats median(values_fill_perc) by _time
| rename values_fill_perc as "values(fill_perc)"

The eventstats command uses the multivalue field creates by the values() aggregate function and adds a new column to the table.

Warning: This will give you the Median of existent values and not the actual median over all events in the database.  (see comment from @ITWhisperer )

_______________________________________

If this was helpful please consider awarding Karma. Thx!

0 Karma
Get Updates on the Splunk Community!

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...

Observability Highlights | January 2023 Newsletter

 January 2023New Product Releases Splunk Network Explorer for Infrastructure MonitoringSplunk unveils Network ...

Security Highlights | January 2023 Newsletter

January 2023 Splunk Security Essentials (SSE) 3.7.0 ReleaseThe free Splunk Security Essentials (SSE) 3.7.0 app ...