Splunk Search

How to append a total row for a column chart?

byu168168
Path Finder

Hi, so I currently have a column chart that has two bars for each day of the week, one bar is reanalysis and one is resubmission. I want to add a third column for each day that does an average across both items but I can't find a good way to add another field entry.

I can shape the data to get the Sum/count separately but I'm not sure how to add a total

Analysis_type        Date        Sum                Count
reanalysis         06/12       50        2  
resubmission         06/12     400      4
reanalysis         06/13       55        5  
resubmission         06/13   500        10

Should look like this in final iteration

    Analysis_type        Date       Average          
    reanalysis         06/12      25
    resubmission         06/12    100   
    Total               06/12      75 ((50 + 400)/(2+4))
    reanalysis         06/13      11
    resubmission         06/13    50
    Total               06/13      37 ((55+500)/(10+5)

Let me know if more information is needed to handle this issue.

Here is the query I am using to generate the first chart.

index=cumulus_test1 AND (analysis_type="reanalysis" OR analysis_type="resubmission")
| dedup file_name
| eval SizeGB = ubf_size/1000000000 
| eval month_num=strftime(_time,"%m")
| eval day_num=strftime(_time,"%d")
| eval date=month_num."/".day_num
| stats sum(SizeGB) AS Sum, count(file_name) AS count by analysis_type, date
0 Karma
1 Solution

somesoni2
Revered Legend

Try like this
** (Updated per comment)**

index=cumulus_test1 AND (analysis_type="reanalysis" OR analysis_type="resubmission")
 | dedup file_name
 | eval SizeGB = ubf_size/1000000000 
 | eval month_num=strftime(_time,"%m")
 | eval day_num=strftime(_time,"%d")
 | eval date=month_num."/".day_num
 | stats sum(SizeGB) AS Sum, count(file_name) AS count by analysis_type, date    
| appendpipe [| stats sum(Sum) as Sum sum(count) count by date | eval analysis_type="Total" ]
| eval Average=Sum/count | fields - Sum count
| sort 0 analysis_type date

View solution in original post

0 Karma

somesoni2
Revered Legend

Try like this
** (Updated per comment)**

index=cumulus_test1 AND (analysis_type="reanalysis" OR analysis_type="resubmission")
 | dedup file_name
 | eval SizeGB = ubf_size/1000000000 
 | eval month_num=strftime(_time,"%m")
 | eval day_num=strftime(_time,"%d")
 | eval date=month_num."/".day_num
 | stats sum(SizeGB) AS Sum, count(file_name) AS count by analysis_type, date    
| appendpipe [| stats sum(Sum) as Sum sum(count) count by date | eval analysis_type="Total" ]
| eval Average=Sum/count | fields - Sum count
| sort 0 analysis_type date
0 Karma

byu168168
Path Finder

Thanks! That worked perfectly!

0 Karma

JDukeSplunk
Builder

Without reading too much, have you tried putting

 | addcoltotals 

At the end?

0 Karma

byu168168
Path Finder

I attempted that however that gives me the totals across the entire date range when I want the totals to be done by date.

0 Karma

woodcock
Esteemed Legend

Just add this to the end of your existing search:

| eval Average = Sum / Count | fields - Sum Count
0 Karma

byu168168
Path Finder

That gives me the average for both reanalysis and resubmission but I'm trying to get an aggregated average across both. I updated the post with a bit more detail

0 Karma

byu168168
Path Finder

There's no need to be so snarky considering this is a place where people ask for help...

Here's the event flow:
I have single events that have a ubf_size and one of two different analysis types. I want an average by analysis types and I also want an aggregated average across both analysis types for each day. I'm not sure what's unclear about that. I simplified it so you can see exactly what I mean, right now I have the first two rows but I want to produce a third total row that sums up the Sum(ubf_size) and count(files) from reanalysis and resubmission in order to produce an aggregated average while retaining the data regarding reanalysis and resubmission.

The data looks like this
Analysis Type Date Sum(ubf_size) count(files) Average
reanalysis 06/12 10 5 2
resubmission 06/12 12 3 4

I want to add a row like this
total 06/12 22 8 2.75
The answer you gave me gives me an average for both reanalysis and resubmission but there is no "total"

0 Karma

skoelpin
SplunkTrust
SplunkTrust

Can you share your query?

It should look something like ... | stats values(Date), avg(Average) by Analysis_type

0 Karma

byu168168
Path Finder

Updated the post with my query

0 Karma

adonio
Ultra Champion

how did you calculate the averages on the second results?
maybe `... | eval Average = round(Sum/Count, 2)
hope it helps

0 Karma

byu168168
Path Finder

My issue isn't calculating the average but taking an aggregated average of reanalysis and resubmission values and generating another entry for each date under a Total

0 Karma

adonio
Ultra Champion

can you please elaborate a little more?
sorry i cant understand exactly what you are looking for but in your example, the math does not work considering average = sum / count, your results do not reflect that.
maybe you refer to the | addcoltotal command?
it is tricky when adding averages, or do you want to add averages?
please share some more info so we can better assist

0 Karma

byu168168
Path Finder

Sorry, I was unclear and used filler numbers to get the point across. I've fixed it so the numbers all make sense

Each day I produce files with a ubf_size. These files fall into two categories reanalysis and resubmission. I want an average of the ubf_sizes for each category on a per day basis. I also want an aggregated average that combines the two categories.

Taking the example from 06/12 above
Reanalysis had 2 files for a total ubf_size of 50 GB giving an average of 25
Resubmission had 4 files for a total ubf_size of 400 GB giving an average of 100

I have no trouble getting the above two rows but I want to somehow append another row that would look like
Total 6 files for a total ubf_size of 450 giving an average of 75.
While retaining the data for reanalysis and resubmission and doing it on a per day basis.

| addcoltotal doesn't work because it does it across the entire dataset when I want the totals per day. I don't want to add averages because that number would be incorrect which is why I calculated the Sum/Count separately to manually calculate the averages instead of using stats avg(ubf_size). I'm simply unsure how to manipulate the Sum/Count to get a separate total.

0 Karma

adonio
Ultra Champion

add to your search:
sum(ubf_size) as total_gb count as total_count
| eval avg = total_gb / total_count

0 Karma

byu168168
Path Finder

Yes, that would successfully give me the total I am looking for but then I lose the statistics regarding reanalysis/resubmission. My final goal is a column chart where the x-axis is the date, y-axis is the average and each date has three columns (reanalysis, resubmission, total).

0 Karma

woodcock
Esteemed Legend

This question is so borked that maybe you should start over. I have NO IDEA what you are trying to do at this point. The path from your events to your "final iteration" is infinitely cloudy. I have no ideal what math/logic you are trying to apply.

0 Karma

byu168168
Path Finder

See my response to Adonio. I fixed up the numbers in the example to make sense.

As it is currently, I can generate the averages for both reanalysis and resubmission just fine. If that was it I could use stats avg(ubf_size) to achieve that.

The issue is that I want an aggregated average which combines the totals of both on a per day basis while retaining the reanalysis and resubmission statistics. To me the best method seems to be calculating the Sum/Count separately then somehow appending the summation on a per day basis to a new analysis_type called "Total" where the

average=Sum(reanalysis+resubmission ubf_size)/Count(reanalysis+resubmission file count).

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...