Ignore empty data point when calculating average

ermosk · ‎05-07-2015

I am trying to calculate the average for a few columns and rows but I have came across the following issue. Some rows might not have data for a particular column, but it seems these are still taken into account when calculating the avg for this column which means the data is way off. I'd like my formula to ignore a data point (if not present) when calculating the average for the column.

Here is my formula :

table ABC* | stats avg(ABC_*)

So if I have something like :

Column : ABC_1 | ABC_2 | ABC_3 | ABC_4 | ABC_5
          1        2       1        2      2  
          2                2               1             
          1                2        3

The average for column ABC_2, ABC_4 and ABC_5 is incorrect.

asturt · ‎09-26-2018

Did you ever find a solution for this? I was doing what woodcock suggested in his reply to your question, but this rounds to a lower precision than the avg function.

stephanefotso · ‎05-07-2015

I'm not sure that the problem is the avg function. Because the avg function returns the average of the values of field X. Means the null value of the field X is not taking into account by avg(X). In another words, in your table, avg(ABC_2) must return 2.
Check the problem elsewhere

SGF

ermosk · ‎05-07-2015

It's quite simple really. Looking at the table I can clearly see what the result should be. Just adding the avg function causes it go off.

srinathd · ‎05-07-2015

Use fillnull to provide values for NULL and try.

ermosk · ‎05-07-2015

Using fillnull value=NULL seems to ignore any columns that have NULL in them.

woodcock · ‎05-07-2015

Try this:

table ABC* | stats count(ABC*) AS countABC*, sum(ABC*) AS sumABC* | eval avg*=sumABC*/countABC*

ermosk · ‎05-07-2015

I like this approach but I can't get it to work. I get the following error:

Error in 'eval' command: The expression is malformed. An unexpected character is reached at '/count_*'.

Here is how my search currently looks

Some initial event filtering | table time_svc_agg* | stats count(time_svc_agg_) AS count_, sum(time_svc_agg_) AS sum_ | eval avg_=sum_/count_*

woodcock · ‎05-07-2015

It looks like you will have to iterate each variable as you need it in the "eval" part like this:

| table time_svc_agg* | stats count(time_svc_agg_*) AS count_*, sum(time_svc_agg_*) AS sum_* | eval avg_1=sum_1/count_1 | eval avg_2=sum_2/count_2 ...

ermosk · ‎05-07-2015

So this introduces a new issue because I don't know beforehand how many of these time_svc_agg will exist, nor their names. Will I need a subquery to take care of this?

ermosk · ‎05-07-2015

I encapsulated them in $$ (like eval avg_=$sum_$/$count_*$ ) but I still cant see any new columns with avg_ in my table

Ignore empty data point when calculating average

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Event Series: Telemetry Pipeline Management

Kick the Tires Before You Commit: A Hands-On Tour of the Splunk Observability Cloud ...

Deep insights, no barriers: Splunk Observability Cloud Free Edition

Join the Conversation

Ignore empty data point when calculating average

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Event Series: Telemetry Pipeline Management

Kick the Tires Before You Commit: A Hands-On Tour of the Splunk Observability Cloud ...

Deep insights, no barriers: Splunk Observability Cloud Free Edition