Splunk Search

Parsing different time fields for transactions

heorhii12412
Explorer

Hello everyone, I have a bit of a strange requirement, which includes close work with time values. 
I have Splunk events in the following format:

 

 

   event_time: 2023-06-29T14:49:42.787Z
   shipment_status: delivered
   timestamp: 2023-06-29T14:49:51.069Z
   tracking_number:95AAEC4900000

 

 

 Where shipment_status can have different values, but I need only in_transit and delivered, also timestamp field is basically the same as the built-in _time field.

I need to group events based on tracking_number field, and show the percentage of these pairs to the rest of the events, but do that in a way that in_transit  events should have come after delivered events more than an hour after. The catch is - we should consider the difference between _time/timestamp field for in_transit event and event_time field for delivered event

For example, my query:

event_status="delivered" OR event_status="in_transit"
| transaction tracking_number startswith=event_status=delivered endswith=event_status=in_transit keepevicted=true
| where duration > 3600
| stats sum(eventcount) as eventcount by closed_txn
| eventstats sum(eventcount) as totalcount
| where closed_txn == 1``` remaining eventcount only includes complete transactions ```
| eval percentage = eventcount * 100 / totalcount

 uses a duration field, which automatically uses _time field for both events, but I need to get the duration of <_time/timestamp(second_event/in_transit)>  -  <event_time(first_event/delivered)> = 1h15m(for example), which is more than one hour, so this transaction should be included in the eventcount calculation. 

Basically, I need to change this | where duration > 3600 condition in my query for a correct time calculation of both events in the transaction, fitting what I described earlier. 
I still have not found a way to compare fields for these separate events in the transaction, so could someone offer some help? I will be very grateful for any suggestions or solutions!

Labels (4)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

Consider altering the _time field of the "delivered" events so the transaction command uses the desired timestamps.

 

event_status="delivered" OR event_status="in_transit"
| eval _time=if(event_status="delivered", strptime(event_time, "%Y-%m-%dT%H:%M:%S.%3N%Z"), _time)
| sort - _time
| transaction tracking_number startswith=event_status=delivered endswith=event_status=in_transit keepevicted=true
| where duration > 3600
| stats sum(eventcount) as eventcount by closed_txn
| eventstats sum(eventcount) as totalcount
| where closed_txn == 1``` remaining eventcount only includes complete transactions ```
| eval percentage = eventcount * 100 / totalcount

 

---
If this reply helps you, Karma would be appreciated.

heorhii12412
Explorer

Good solution! I didn't even know we could intervene in the fields to this extent. But this solution didn't work for me. Splunk gives Error in 'transaction' command: This search requires events to be in descending time order, but the preceding search does not guarantee time-ordered events. 


I don't really understand the problem, if the conditions are not met, then why not ignore this transaction in the final calculation? Probably has something to do with transaction conditions startswith and endswith. But I'm not sure. Do you happen to know a solution?

0 Karma

richgalloway
SplunkTrust
SplunkTrust

When transaction makes that complaint, throw in a sort.  See my updated answer.

---
If this reply helps you, Karma would be appreciated.

heorhii12412
Explorer

@richgalloway I have another query:

 

event_status="delivered" OR event_status="in_transit"
| transaction tracking_number startswith=event_status=delivered endswith=event_status=in_transit keepevicted=true 
| where duration > 3600    
| search closed_txn=1

 

Which returned 5 transactions fit for this condition for the last 2 weeks. Here's one of those(accidentally fits my final requirement):

 

{"timestamp":"2023-07-06T23:44:52.784Z","tracking_number":"C11900891311111","event_status":"delivered","event_time":"2023-07-06T23:44:38.722Z"}
{"timestamp":"2023-07-07T07:03:10.712Z","tracking_number":"C11900891311111","event_status":"in_transit","event_time":"2023-07-03T18:24:36.668Z"}

 

If we take a look at event_time field of the delivered event and assume that it will replace _time field for it, then after comparison of two timestamps in_transit event clearly came 8 hours after the first one. 

For some reason, these 5 transactions are not included in the total calculations in the query that you helped me to compose:

 

event_status="delivered" OR event_status="in_transit"
| eval _time=if(event_status="delivered", strptime(event_time, "%Y-%m-%dT%H:%M:%S.%3N%Z"), _time)
| sort - _time
| transaction tracking_number startswith=event_status=delivered endswith=event_status=in_transit keepevicted=true
| where duration > 3600
| stats sum(eventcount) as eventcount by closed_txn
| eventstats sum(eventcount) as totalcount
| where closed_txn == 1``` remaining eventcount only includes complete transactions ```
| eval percentage = eventcount * 100 / totalcount

 

 The strange thing is - I tried singling this transaction this way:

 

event_status="delivered" OR event_status="in_transit" AND "C11900891311111"
| eval _time=if(event_status="delivered", strptime(event_time, "%Y-%m-%dT%H:%M:%S.%3N%Z"), _time)
| sort - _time
| transaction tracking_number startswith=event_status=delivered endswith=event_status=in_transit keepevicted=true
| where duration > 3600
| stats sum(eventcount) as eventcount by closed_txn

 

And successfully found it!
But removing condition AND "C11900891311111" at the start of the query gives me 0 results. This is very strange, right?

P.S. Sorry for my continuous inquiries, I just really want to get to the root of the problem. I will upvote your answers as well!

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Try this more generic query.

(event_status="delivered" OR event_status="in_transit") AND tracking_number=*
| eval _time=if(event_status="delivered", strptime(event_time, "%Y-%m-%dT%H:%M:%S.%3N%Z"), _time)
| sort - _time
| transaction tracking_number startswith=event_status=delivered endswith=event_status=in_transit keepevicted=true
| where duration > 3600
| stats sum(eventcount) as eventcount by closed_txn
---
If this reply helps you, Karma would be appreciated.

heorhii12412
Explorer

Unfortunately, this didn't work for me. Still, no idea why those events are not found

0 Karma
Get Updates on the Splunk Community!

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...