Splunk Search

Using Transaction on one to many relationship data

Engager

I have following data from a log file and I am trying to create a Report on Splunk site.

2014-07-08 16:09:55.966473 | orderID | b6ef3555 | status= "TRADE", seq="34521", out_time ="34527"
2014-07-08 16:09:55.966473 | orderID | b6ef3555 | status= "TRADE", seq="34522", out_time ="34528"

2014-07-08 16:09:55.966473 | orderID | b6ef3559 | status= "NEW", seq="34522", out_time="34533"

2014-07-08 16:09:55.966473 | orderID | b6ef3560 | status= "NEW", seq="34522", out_time="34535"

2014-07-08 16:09:55.966473 | orderID | b6ef3562 | status= "NEW", seq="34523", out_time="34537"

2014-07-08 16:09:55.966473 | orderID | b6ef3555 | status= "TRADE", seq="34552", out_time ="34540"

2014-07-08 16:09:55.966473 | orderID | b6ef3561 | status= "TRADE", seq="34554", out_time="34559"

2014-07-08 16:09:55.966473 | orderID | b6ef3565 | status= "NEW", seq="34554", out_time="34568"

2014-07-08 16:09:55.966473 | orderID | b6ef3567 | status= "NEW", seq="34555", out_time="34570"

I want to calculate the out_time difference for only those status='NEW" message which can be matched with Trade based on seq field

My output should look like :

OrderID,out_time_diff,out_time_New,out_time_Trade

b6ef3559,5,34533,34528

b6ef3560,7,34535,34528

b6ef3565,14,34568,4554

I can match the Trade message to New message based on "seq" field but since I can have one to many relationships between trade to new message, I am unable to calculate the differences using "transaction seq" command.
Also plz note that I might have some extra trade and extra new messages which I do not want to take into consideration.
What are the other options apart from using transaction ?
Or is there a way I can use transaction so as to reuse the data from one line to multiple other lines ?

Tags (1)
0 Karma

Motivator

NOTE: Edited search to fix typos and add a where condition. I didn't have access to a splunk instance to test the search out before when I posted it. You might need to adjust field names and/or capitalization to match your instance (for example: orderID vs OrderID in the eval statement on line 4 of this search.

Another approach:

... your base search here ... 
eval Base_Time=if(status=="TRADE",out_time,null()) 
| eval New_Time=if(status=="NEW",out_time,null()) 
| eval New_Time_and_OrderID=mvzip(New_Time,orderID) 
| stats values(New_Time_and_OrderID) as New_Time_and_OrderID values(Base_Time) as Base_Time by seq 
| mvexpand New_Time_and_OrderID 
| rex field=New_Time_and_OrderID "(?<New_Time>[^,]+),(?<OrderID>[^,]+)" 
| eval out_time_diff=New_Time-Base_Time 
| where isnotnull(Base_Time) 
| table OrderID out_time_diff Base_Time New_Time

This should give you the results you are looking for. When I run this on the sample data you provided, it gives the following results:

OrderID        out_time_diff     Base_Time      New_Time
b6ef3559                   5         34528         34533
b6ef3560                   7         34528         34535
b6ef3565                   9         34559         34568

OrderID's "b6ef3559" and "b6ef3560" are both "NEW" records, and both have the same sequence number (34522) corresponding to the sequence number of a single "TRADE" record. How does this work for you?

0 Karma

Motivator

I've updated my original answer to address your comment, fix typos, and fix my regex. Please try it now and see if it works for you.

0 Karma

Engager

this doesnt work when I have one "trade" and 2 "new" data
Also I might have more new and trade messages which wont match up based on "seq" field, I only want to display the NEW and TRADE data which matches based on "seq" field

0 Karma

SplunkTrust
SplunkTrust

You should be able to use something along these lines:

base search | eventstats first(eval(case(status=="TRADE",out_time))) as out_time_trade by seq | search status="NEW" | eval out_time_diff = out_time - out_time_trade | table OrderID out_time_diff out_time out_time_trade

The eventstats looks for the TRADE event for each value of seq and copies its out_time into the NEW events.

0 Karma

SplunkTrust
SplunkTrust

You can filter for that like this at the start of the second line:

... | search status="NEW" out_time_trade=* | ...

That way a NEW event will only be considered if there is a matching TRADE event for that seq value.

0 Karma

Engager

Actually I might have more new and trade messages which wont match up based on "seq" field, I only want to display the NEW and TRADE data which matches based on "seq" field

0 Karma
Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!