Splunk Search

Performing Sum Calculation when Field values are combined

limalbert
Path Finder

Captured fields are Account, RequestorCode, Service, and ElapsedTime. An Account will have multiple RequestorCode, and a RequestorCode will have multiple Service. ElapsedTime format is "d.ddms".

  1. I'm trying to combine data that have Service=FireTV and Service=Echo with Account and RequestorCode must be the field with identical data values. When combined, I need help to sum the ElapsedTime between Service=FireTV and Service=Echo. There's only going to be a unique set of Service="FireTV" and Service="Echo" per Account + RequestorCode.
  2. After that, I need help to show the highest ElapsedTime from each data that has the same Account and RequestorCode.

Initial query

index=
| table _time Account RequestorCode Service ElapsedTime 

Sample event log (Possibly populate multivalued fields)

2018-08-16 21:00:00,000 [-dd.dd.dd.ddd:dddd-d] [INFO] [                    ] [info:dd.dd.dd] (info.info.info) INFO   Rei21 - Service=Echo for Account=Rei, SiteID=555 ElapsedTime=20.0ms on RequestorCode=XX1 Channel=info OperatorID=Rei21 Page=info TransactionID=info

Raw sample data

Account: Rei
RequestorCode: XX1
Service: FireTV
ElapsedTime: 80.0ms    

Account : Rei
RequestorCode: XX1
Service: Echo
ElapsedTime: 20.0ms

Account : Rei
RequestorCode: XX1
Service: GoogleHome
ElapsedTime: 50.0ms

Account : Cliff
RequestorCode: XX1
Service: Google Home
ElapsedTime: 20.0ms

Account : Cliff
RequestorCode: XX1
Service: HomePod
ElapsedTime: 50.0ms

Account : Cliff
RequestorCode: XX2
Service: FireTV
ElapsedTime: 70.0ms

Sample data when FireTV and Echo are combined as Amazon

Account : Rei
RequestorCode: XX1
Service: Amazon
ElapsedTime: 100.0ms

Account : Rei
RequestorCode: XX1
Service: GoogleHome
ElapsedTime: 50.0ms

Account : Cliff
RequestorCode: XX1
Service: Google Home
ElapsedTime: 20.0ms

Account : Cliff
RequestorCode: XX1
Service: HomePod
ElapsedTime: 50.0ms

Account : Cliff
RequestorCode: XX2
Service: Amazon
ElapsedTime: 70.0ms

Expected output

Account: Rei
RequestorCode: XX1
Service: Amazon
ElapsedTime: 100.0ms

Account : Cliff
RequestorCode: XX1
Service: HomePod
ElapsedTime: 50.0ms

Account : Cliff
RequestorCode: XX2
Service: Amazon
ElapsedTime: 70.0ms
Tags (1)
0 Karma
1 Solution

somesoni2
Revered Legend

Give this a try

your base search
| table _time Acount RequestorCode Service ElapsedTime
| eval Service=if(Service="FireTV" OR Service="Echo","Amazon",Service)
| stats sum(ElapsedTime) as ElapsedTime by Acount RequestorCode Service
| eventstats max(ElapsedTime) as max by Account RequestorCode
| where ElapsedTime=max
| table Acount RequestorCode Service ElapsedTime

View solution in original post

0 Karma

somesoni2
Revered Legend

Give this a try

your base search
| table _time Acount RequestorCode Service ElapsedTime
| eval Service=if(Service="FireTV" OR Service="Echo","Amazon",Service)
| stats sum(ElapsedTime) as ElapsedTime by Acount RequestorCode Service
| eventstats max(ElapsedTime) as max by Account RequestorCode
| where ElapsedTime=max
| table Acount RequestorCode Service ElapsedTime
0 Karma

limalbert
Path Finder

This is close. Is it possible to add if statement for sum(ElapsedTime) only when Service=Amazon? Otherwise, keep the original ElapsedTime. This query is adding ElapsedTime for all the Services that's not Service=Amazon.

0 Karma

somesoni2
Revered Legend

So for other services, if there are multiple records, you just want the max? What if there are multiple records for each of Service="FireTV" and Service="Echo", (say there are 2 entries for FireTV and 2 for Echo) in those cases should all those elapsedTime should be summed?

0 Karma

limalbert
Path Finder

You're correct! For other services, there are multiple records, and I want to get the max.

For my case, there's only going to be a unique set of Service="FireTV" and Service="Echo" per Account + RequestorCode. Echo, FireTV, Google Home, and HomePod are used as a mock example.

0 Karma

somesoni2
Revered Legend

Give this a try

your base search
 | table _time Acount RequestorCode Service ElapsedTime
 | eval Service=if(Service="FireTV" OR Service="Echo","Amazon",Service)
 | eventstats max(ElapsedTime) as max sum(eval(if(Service="Amazon",ElapsedTime,0))) as totalAmazon by Account RequestorCode
 | where ElapsedTime=max OR (Service="Amazon")
 | dedup Acount RequestorCode Service 
| eval ElapsedTime=if(totalAmazon>0,totalAmazon,ElapsedTime)
 | table Acount RequestorCode Service ElapsedTime

limalbert
Path Finder

This works!! Thank you!
Why do you use eventstats instead of stats for this problem? @somesoni2

0 Karma

somesoni2
Revered Legend

The eventstats does the aggregation without changing the current resultset (it adds columns to existing results without reducing rows). This way all rows were intact and aggregation was done for rows with Service=Amazon.
Prior to that, stats was doing aggregation for all, not only for Service=Amazon, so multiple values for a Account RequestorCode were getting collapsed into one.

Also, After eventstats, there would be two rows for Service=Amazon (from original Service=FireTV and Service=Echo), so that dedup would remove one of them. The eventstats would've added sum(ElapsedTime) to both rows, so anyone will serve your purpose.

niketn
Legend

@limalbert please do accept the answer if your query is resolved. Also do not forget to up vote the comments that helped! 🙂

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

niketn
Legend

@limalbert your sample data and output in question do not match. Sum does not seem to be based on Account and RequestorCode. So for first example both are always 123 and XX1 for three sample data. So the final ElapsedTime should be 150. Even if you plan to take Service for correlation it will not be as per your expected output. Service Names do not match between Sample Data and Output. Please clarify.

Also what is the requirement for highest elapsed time. Please explain with Sample Data again. If you do sum by Account and RequestorCode, then highest by same fields will not do anything unless you are changing the stats aggregating field in the by clause. If you also want Service to be accounted for this you can add that to by clause in stats.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

limalbert
Path Finder

Question is updated. Thank you for the feedback.

0 Karma

renjith_nair
Legend

@limalbert,

Try,

index=yourindex "other search terms"
|table _time Service Account ElapsedTime RequestorCode
|stats sum(ElapsedTime) as totalElapsedTime,max(ElapsedTime) as highestElapsedTime  by Account,RequestorCode
Happy Splunking!
0 Karma

limalbert
Path Finder

Unfortunately, this won’t work for my case. If the data is simpler, this will work. But, there will be multiple data based on Account and RequestorCode.

The elapsedTime from the service that needs to be combined has to be added together first. The, populate the highest elapsedTime based on the account and RequestorCode.

0 Karma

renjith_nair
Legend

Ok I move this to the comment section but do yo have sample data which reflects the scenario better?

Happy Splunking!
0 Karma

limalbert
Path Finder

Sample data is updated in the question.

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...