Splunk Search

Calculating data with multiple transactions per order

timm747747
Path Finder

Hi, I have data with the following columns: OrderNo, Transaction Start, Transaction Stop. I wrote a search by OrderNo to get the time difference for each order. The problem is that Order Number 333 below has multiple transactions, and I need to calculate the duration for every two lines of data within an OrderNo.

[Screenshot of sample data with the OrderNo, Transaction Start, and Transaction Stop columns; OrderNo 333 appears with multiple transactions]

It works fine until I get to Orders that have multiple transactions.

index=myindex source=mysource Service=myservice OrderNo=*
| eval start_time = strptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| stats earliest(start_time) as start_time earliest(stop_time) as stop_time by OrderNo, Service
| eval duration=tostring(stop_time-start_time)
| stats mean(duration) as avg_duration by Service
| table Service, avg_duration

Is it possible to read through one OrderNo and split it up into several transactions? It's obvious I shouldn't be using earliest, but I only just realized some of the orders have multiple transactions, and after searching and coming up empty I ended up here.

Thanks!


cmerriman
Super Champion

Can you try using streamstats and filldown to get them into one line?

index=myindex source=mysource Service=myservice OrderNo=*
| eval start_time = strptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S") 
| filldown transaction_start
| streamstats count by OrderNo transaction_start
| search count=2
| eval duration=tostring(stop_time-start_time) 
| stats mean(duration) as avg_duration by Service 
| table Service, avg_duration
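
If it helps to see the pairing idea on throwaway data, here is a run-anywhere sketch (hypothetical timestamps, assuming each start and stop arrives as a separate event; note it also fills down start_time so the stop row carries both timestamps):

| makeresults count=4
| streamstats count as n
| eval OrderNo="333"
| eval transaction_start=case(n=1, "2019-01-01 10:00:00", n=3, "2019-01-01 10:30:00")
| eval transaction_stop=case(n=2, "2019-01-01 10:05:00", n=4, "2019-01-01 10:40:00")
| eval start_time=strptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time=strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| filldown transaction_start start_time
| streamstats count as pair_count by OrderNo transaction_start
| where pair_count=2
| eval duration=stop_time-start_time
| table OrderNo transaction_start transaction_stop duration

The two surviving rows are the stop events, each paired with the start event that preceded it, giving durations of 300 and 600 seconds.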

sideview
SplunkTrust

If you can assume that the multiple transactions will never overlap or interleave, then you can use streamstats to make yourself an additional field that supplies the extra distinctness. Note the streamstats I've added on line 4, and the extra "transaction_count" field I added to the group-by clause in the stats command.

index=myindex source=mysource Service=myservice OrderNo=*
| eval start_time = strptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S") 
| streamstats dc(start_time) as transaction_count by OrderNo
| stats earliest(start_time) as start_time earliest(stop_time) as stop_time by OrderNo, transaction_count, Service 
| eval duration=tostring(stop_time-start_time) 
| stats mean(duration) as avg_duration by Service 
| table Service, avg_duration

In the above, the | streamstats dc(start_time) as transaction_count by OrderNo command sneaks in just before the main stats, and it will paint a little integer onto each transaction. For the simple orders they'll each just get a "1" for that integer, but the orders with multiple transactions will get a different integer for each transaction.
Of course, if the multiple transactions for a given OrderNo can get interleaved, then this will become a mess. (You could even craft an explicit search to test whether that assumption holds, and possibly run that second search for a while as an alert if you're paranoid about it.)
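
If you want to see that numbering in isolation, here is a tiny run-anywhere example (made-up values, one transaction for OrderNo 111 and two for 333):

| makeresults count=3
| streamstats count as n
| eval OrderNo=if(n=1, "111", "333")
| eval start_time=1500000000 + (n * 100)
| eval stop_time=start_time + 60
| streamstats dc(start_time) as transaction_count by OrderNo
| table OrderNo start_time stop_time transaction_count

OrderNo 111 gets transaction_count=1, while the two rows for 333 get 1 and 2, which is the extra distinctness the stats group-by needs.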


woodcock
Esteemed Legend

Like this:

index=myindex source=mysource Service=myservice OrderNo=*
| eval start_time = strptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S") 
| streamstats last(stop_time) as stop_time by OrderNo, Service
| search start_time="*"
| eval duration=tostring(stop_time-start_time) 
| eventstats mean(duration) as avg_duration by Service 
| table Service, avg_duration

timm747747
Path Finder

Thank you, that worked!


sideview
SplunkTrust

Note that for your OrderNo values that have multiple transactions, this answer is going to calculate a single duration from the start of the order's earliest transaction to the end of its latest transaction. It will factor that single large duration into the later average, rather than factoring in the individual transaction durations. For example, if an order ran from 10:00 to 10:05 and again from 10:30 to 10:40, one 40-minute duration goes into the average instead of the separate 5- and 10-minute durations. But if that is OK or even desired, then indeed this approach is simpler.
