Splunk Search

How to calculate median, avg for total results?

badbuda
Loves-to-Learn Lots

hey,

I need to build a report, that contains approx 500 thousand events. the requirement is  that the report will contain three rows -
I need to count if httpStatus is ok or not, and classify each eventId in its propper position. (the requirement is that we will have minimal amount of rows!!! I cant duplicate or have more then 10 rows)

so basically the report looks like this:

badbuda_0-1679503456831.png

I have uri column that contains all of my desired info, and all of my calculations of median, avg, precentage etc, are based on the time field as follows:

 

 

|*MY SEARCH *
|stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate 
min(date) as minDate values(timeTaken.total) as time by status 

|table  uri totalCount prec95 prec5 med average status maxDate minDate time

 

 

 

now my question is-

I need to add a new line of totals, based on the other lines.

beacuse Im using functions such as avg, median etc, I dont think I can use |addtotals

and a very important note is that all of my values in the columns time and uri are not distinct. that means they can appear more then once, and then my calculations are wrong, and I cant base a following stats based on the previous one. Ive tried using list, but it has a limit of 100 values, and I have  hundred of thousands. 

what can I do to add another total row that will calculate all of my events ?

Ive tried adding |appendPipe it this way based on the results Ive gotten in the stats command, but of course I got wrong values (because the time result is not distinct, and the values shown in the stats are distinct)

thats my report after adding the total calculation (that didnt work)

badbuda_1-1679504496994.png

 

 

 

|*MY SEARCH *
|stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate 
min(date) as minDate values(timeTaken.total) as time by status 

|appendpipe [stats sum(totalCount) as totalCount  values(uri) as uri values(newTime) as newTime perc95(time) as prec95 perc5(time) as prec5 median(time) as med avg(time) as average| eval status="TOTAL"]

|table  uri totalCount prec95 prec5 med average status maxDate minDate time

 

 

 

I really hope that Ive made my question clear

thank's in advance 🙂

 

 

Labels (1)
0 Karma

badbuda
Loves-to-Learn Lots

I think I found my solution!
I've changed the first stats command to another appendpipe, so it wont ruin the totals-

 

|appendpipe [stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate min(date) as minDate| eval finalStatus="TOTAL"]
|appendpipe [stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate
min(date) as minDate values(timeTaken.total) as time by status | eval finalStatus="ok"]

 

now my results seems to make sense.

thank you all!
 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Since you know the names of all of the fields, there's no reason why you can't use addtotals or (better) addcoltotals.

|*MY SEARCH *
|stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate 
min(date) as minDate values(timeTaken.total) as time by status 

|table  uri totalCount prec95 prec5 med average status maxDate minDate time
|addcoltotals totalCount prec95 prec5 med average 
---
If this reply helps you, Karma would be appreciated.
0 Karma

badbuda
Loves-to-Learn Lots

Thank you for your answer,

But can I add averages (and medians, percentiles etc)?

 

Addcoltotals will just sum these fields together, won't it?

Adding two averages together won't give me the right average I'm looking for

Am I wrong?

0 Karma

PickleRick
SplunkTrust
SplunkTrust

You're absolutely right. Adding two averages doesn't give you a "total average" of course.

It seems your original idea was OK, just had small mistakes in the middle.

|*MY SEARCH *
|stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate 
min(date) as minDate values(timeTaken.total) as time by status 

|appendpipe [stats count(request.uri) as totalCount values(uri) as uri values(timeTaken.total) as newTime perc95(timeTaken.total) as prec95 perc5(timeTaken.total) as prec5 median(timeTaken.total) as med avg(timeTaken.total) as average max(date) as maxDate min(date) as minDate values(timeTaken.total) as time| eval status="TOTAL"]

And you don't need table at the end.

0 Karma

badbuda
Loves-to-Learn Lots

thanks, thats exactly what I tried doing at the beginning

but after doing stats for the first time, if Im not saving my fields in stats command, they will not appear in the following query. is there any way of saving previous data before stats command?

I did exactly as you said and these are my results

 

badbuda_0-1679550227238.png

 

Tags (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

Ahh, right. I don't often use appendpipe so I got confused by the example in the docs. Yes, appendpipe operates at the already aggregated values so it won't be very helpful here.

Unfortunately if you have a parameter which has to be calculated over the whole set you can't calculate it partially 😉 Your solution will skew results a bit because with the first appendpipe you change the overall count of the results. And if I'm not mistaken it will not be that much more effective than eventstat so you could probably use that one instead.

The other solution could be to "manually" calculate the parameters - but that could be tedious.

0 Karma
Get Updates on the Splunk Community!

Notification Email Migration Announcement

The Notification Team is migrating our email service provider from Postmark to AWS Simple Email Service (SES) ...

Mastering Synthetic Browser Testing: Pro Tips to Keep Your Web App Running Smoothly

To start, if you're new to synthetic monitoring, I recommend exploring this synthetic monitoring overview. In ...

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...