I have a use case where I need to calculate the time difference between four events. In the first event the server receives a request, so you will see "Received" in the log with a timestamp. In the second event the server sends a request out to a remote server, so you will see "Sending" in the log with a timestamp. In the third event the server receives the reply from the remote server, so you will see "Received Reply" in the log with a timestamp. Lastly, the server sends the response to the original request, so you will see "Replying". See the log detail for each event below:
A. 20180610 00:00:31.636 line#68 TRANSACTION TRACE [ tx12345 ] [ ServiceManager ] [ Received ] [ReturnCode: 0 ]
B. 20180610 00:00:31.745 line#68 TRANSACTION TRACE [ tx12345 ] [ ServiceManager ] [ Sending ] [ReturnCode: 0 ]
C. 20180610 00:00:32.416 line#68 TRANSACTION TRACE [ tx12345 ] [ ServiceManager ] [ Recieved Reply ] [ReturnCode: 0 ]
D. 20180610 00:00:32.432 line#68 TRANSACTION TRACE [ tx12345 ] [ ServiceManager ] [ Replying ] [ReturnCode: 0 ]
The transaction ID (tx12345) is unique to each transaction; a different transaction has a different ID.
We always want to calculate:
1. The time difference between A & D (i.e. 00:00:31.636 and 00:00:32.432), so we can get the total time it took the server to receive the request, process it, and reply.
2. The time differences between A & B and between B & C.
Is this possible at all??
Well, what you need to do here is two things:
Time difference between A & B and between B & C - that is, Sending minus Received, and Received Reply minus Sending.
All you need to do is get A's time value onto B's row and compute B-A, and do the same for C-B (a concrete sketch is included after the pseudo code below).
See the use of streamstats to find the previous row's value here: https://answers.splunk.com/answers/150909/how-can-i-get-the-previous-value-of-the-field-that-im-comp...
For the 1st question - the time difference between A & D - it's a bit more complex, but still possible.
Pseudo code:
If the event row contains Received - do nothing.
If the event row contains Sending - use streamstats to get the timestamp from the Received row (call it receivedtime, for example). Now B-A = current row's time - receivedtime.
If the event row contains Received Reply - use streamstats to get the previous value of receivedtime (call it prvreceivedtime) AND the previous timestamp from the Sending row (call it sendingtime). C-B = current row's time - sendingtime.
If the event row contains Replying - use streamstats to get the previous value of prvreceivedtime (call it receivedtimeorgnl), and now D-A = current row's time - receivedtimeorgnl.
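A minimal sketch of that idea in SPL, assuming the transaction ID and event-name fields still need to be extracted (the rex, and the field names tx_id and event, are my assumptions - adjust them to your data):

... "TRANSACTION TRACE"
| rex "TRANSACTION TRACE \[\s*(?<tx_id>\S+)\s*\] \[\s*\S+\s*\] \[\s*(?<event>[^\]]+?)\s*\]"
| sort 0 tx_id _time
| streamstats current=f last(_time) as prev_time by tx_id
| eval step_seconds = _time - prev_time
| eval received_time = if(event=="Received", _time, null())
| streamstats last(received_time) as received_time by tx_id
| eval total_seconds = if(event=="Replying", _time - received_time, null())

Here step_seconds gives B-A on the Sending row and C-B on the Received Reply row, and total_seconds gives D-A on the Replying row; the second streamstats just carries each transaction's Received timestamp forward to its Replying row.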
You can use something like
...| transaction <transaction_id_field> mvlist=t
to group the events by their transaction ID. Once you've done that you can use mvindex to get the time of the desired event for each transaction. For example:
| eval received_time = mvindex(<your_time_field>, 1) | eval replying_time = mvindex(<your_time_field>, -1) | eval total_response_time = replying_time - received_time
This example would compute D-A, but the same can be done for B-A and C-B using their respective indexes in the transaction. You may need to play with commands like tonumber() and the time format to do the subtraction.
Also, it doesn't look like you'll have this problem so I didn't mention it, but if your transactions are not grouped in chronological order for some reason, you can use mvfind() to index within a transaction more reliably. If this is a problem for you I can post a more detailed explanation.
Also, I realized I messed up the indexing: it starts from 0, not 1 (-1 is still valid for indexing the last entry). Sorry for the confusion.
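Putting the whole thing together, a rough example along those lines (TxID and event_time are placeholders - use whatever your transaction ID field is actually called):

... "TRANSACTION TRACE"
| eval event_time = _time
| transaction TxID mvlist=t
| eval received_time = mvindex(event_time, 0)
| eval replying_time = mvindex(event_time, -1)
| eval total_response_time = replying_time - received_time

Copying _time into event_time before the transaction keeps a numeric (epoch) timestamp for every member of the transaction, so the subtraction works without any tonumber()/time-format juggling; B-A would be mvindex(event_time, 1) - mvindex(event_time, 0) and C-B would be mvindex(event_time, 2) - mvindex(event_time, 1), assuming the four events sit in chronological order within each transaction.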
Thanks for the feedback. I have used
| transaction TxID mvlist=t | eval received_time= mvindex(Received, 0) | eval replying_time=mvindex(Replying,-1) | eval total_response_time = replying_time - received_time| timechart avg(duration) and
transaction host startswith="Received" endswith="Replying"| timechart avg(duration)
and they both worked. However, I read that using "transaction" is quite expensive from a performance perspective. Is this correct? If so, do we have any other way to do this?
Transaction can be a performance inhibitor; I always try to reduce the data as much as possible ahead of time to mitigate this (using field=value filters, where, etc.). That being said, I don't have an immediate solution to handle this problem another way; perhaps investigate the other comment on this post. Best of luck.
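For example, something along these lines (your_index is a placeholder) keeps the base search as narrow as possible so fewer events ever reach transaction:

index=your_index "TRANSACTION TRACE" ServiceManager
| transaction TxID startswith="Received" endswith="Replying"
| timechart avg(duration)

The more literal strings and field=value filters you can push into the first line, the less work transaction has to do.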