Hi, I have the following data with the following columns, OrderNo, Transaction Start, Transaction Stop. I wrote a search by OrderNo to get the time difference for each order. The problem is that Order Number 333 below has multiple transactions and I need to calculate based on every 2 lines of data based on OrderNo.
It works fine until I get to Orders that have multiple transactions.
index=myindex source=mysource Service=myservice OrderNo=*
|eval start_time = strrptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| stats earliest(start_time) as start_time earliest(stop_time) as stop_time by OrderNo, Service
| eval duration=tostring(stop_time-start_time)
| stats mean(duration) as avg_duration by Service
| table Service, avg_duration
Is it possible to read through one OrderNo to split it up into several transactions. It's obvious I shouldn't be using earliest but I just realized some of the orders have multiple transactions and after searching and coming up empty I ended up here.
Thanks!
Like this:
index=myindex source=mysource Service=myservice OrderNo=*
|eval start_time = strrptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| streamstats last(stop_time) as stop_time by OrderNo, Service
| search start_time="*"
| eval duration=tostring(stop_time-start_time)
| eventstats mean(duration) as avg_duration by Service
| table Service, avg_duration
can you try using streamstats
and filldown
to get them into one line?
index=myindex source=mysource Service=myservice OrderNo=*
|eval start_time = strrptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| filldown transaction_start
| streamstats count by OrderNo transaction_start
| search count=2
| eval duration=tostring(stop_time-start_time)
| stats mean(duration) as avg_duration by Service
| table Service, avg_duration
If you can make the assumption that the multiple transactions will never be overlapping, or interleaving, then you can use streamstats to make yourself an additional field that can supply the extra distinctness. Note the streamstats I've added in line 4, and the extra "transaction_count" field I added to the group by clause in the stats command.
index=myindex source=mysource Service=myservice OrderNo=*
| eval start_time = strrptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| streamstats dc(start_time) as transaction_count by OrderNo
| stats earliest(start_time) as start_time earliest(stop_time) as stop_time by OrderNo, transaction_count, Service
| eval duration=tostring(stop_time-start_time)
| stats mean(duration) as avg_duration by Service
| table Service, avg_duration
In the above, the | streamstats dc(start_time) as transaction_count by OrderNo
command sneaks in just before the main stats, and it will paint a little integer onto each transaction. For the simple ones they'll each just get a "1" for that integer. But the ones with the multiple transaction will get a different integer for each transactoin.
Of course, if the multiple transactions for a given OrderNo can get interleaved, then this will become a mess. (You could even then craft an explicit search to test whether that assumption is true, possibly even run that second search for a while as an alert if you're paranoid about it. )
Like this:
index=myindex source=mysource Service=myservice OrderNo=*
|eval start_time = strrptime(transaction_start, "%Y-%m-%d %H:%M:%S")
| eval stop_time = strptime(transaction_stop, "%Y-%m-%d %H:%M:%S")
| streamstats last(stop_time) as stop_time by OrderNo, Service
| search start_time="*"
| eval duration=tostring(stop_time-start_time)
| eventstats mean(duration) as avg_duration by Service
| table Service, avg_duration
Thank you that worked!
Note that for your OrderNo values that have multiple transactions, this answer is going to calculate a single duration that is from the start of the Order's earliest transaction to the end of the latest transaction. it will factor that single large duration into the later average, rather than factoring in the individual transaction durations. But if that is OK or even desired, then indeed this approach is simpler.