I am trying to find the time taken by our processes. I wrote a basic query that fetches the start time, the end time, and the difference for a particular interaction, using max and min to find the start and end times. But I am not sure how to find the start and end times of multiple processes by looking at the messages.
index=application_na sourcetype=my_logs:hec appl="*" message="***" interactionid=12345 | table interactionid, seq, _time, host, severity, message, msgsource | sort _time
| stats min(_time) as StartTime, max(_time) as EndTime by interactionid
| eval Difference=EndTime-StartTime | fieldformat StartTime=strftime(StartTime, "%Y-%m-%d %H:%M:%S.%3N")
| fieldformat EndTime=strftime(EndTime, "%Y-%m-%d %H:%M:%S.%3N")
| fieldformat Difference=tostring(Difference,"duration")
| table interactionid, StartTime, EndTime, Difference
I have messages that look like this:
| interactionid | _time | message |
|---|---|---|
| 12345 | 2025-06-26 07:55:56.317 | TimeMarker: WebService: Received request. (DoPayment - ID:1721 Amount:16 Acc:1234) |
| 12345 | 2025-06-26 07:55:56.717 | OtherApp: -> Sending request with timeout value: 15 |
| 12345 | 2025-06-26 07:55:57.512 | TimeMarker: OtherApp: Received result from OtherApp (SALE - ID:1721 Amount:16.00 Acc:1234) |
| 12345 | 2025-06-26 07:55:58.017 | TimeMarker: WebService: Sending result @20234ms. (DoPayment - ID:1721 Amount:16 Acc:1234) |
So, I want to get the time taken by `OtherApp` from when it received a request to when it responded back to my app, and then the total time taken by my service `DoPayment`. Is this achievable?
The output I am looking for is:

| interactionid | DoPayment Start | OtherApp Start | OtherApp End | DoPayment End |
|---|---|---|---|---|
I agree with others that your purpose is better served by knowing your data. Given what you have revealed, you can simply describe the four events and lay them out with xyseries.
index=application_na sourcetype=my_logs:hec appl="*" message="***"
| eval event = case(match(message, "Received request"), "DoPayment start",
match(message, "Sending result"), "DoPayment end",
match(message, "Sending request"), "OtherApp start",
match(message, "Received result"), "OtherApp end")
| eval _time = strftime(_time, "%F %T.%3N")
| xyseries interactionid event _time
Obviously, the regexes used in the match functions are just to illustrate what you can do. But xyseries can achieve what you want without complex transformations. Using your mock data, the output is:
| interactionid | DoPayment end | DoPayment start | OtherApp end | OtherApp start |
|---|---|---|---|---|
| 12345 | 2025-06-26 07:55:58.017 | 2025-06-26 07:55:56.317 | 2025-06-26 07:55:57.512 | 2025-06-26 07:55:56.717 |
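If you also want the durations and not just the four timestamps, here is a minimal variation on the above. The duration field names are my own invention; the trick is to keep _time numeric until after xyseries so the event columns can be subtracted:

index=application_na sourcetype=my_logs:hec appl="*" message="***"
| eval event = case(match(message, "Received request"), "DoPayment start",
    match(message, "Sending result"), "DoPayment end",
    match(message, "Sending request"), "OtherApp start",
    match(message, "Received result"), "OtherApp end")
| xyseries interactionid event _time
``` _time is still epoch seconds here, so the columns can be subtracted ```
| eval "OtherApp duration" = 'OtherApp end' - 'OtherApp start'
| eval "DoPayment duration" = 'DoPayment end' - 'DoPayment start'
| fieldformat "OtherApp duration" = tostring('OtherApp duration', "duration")
| fieldformat "DoPayment duration" = tostring('DoPayment duration', "duration")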
Here is an emulation you can play with and compare with real data
| makeresults format=csv data="interactionid,_time,message
12345,2025-06-26 07:55:56.317,TimeMarker: WebService: Received request. (DoPayment - ID:1721 Amount:16 Acc:1234)
12345,2025-06-26 07:55:56.717,OtherApp: -> Sending request with timeout value: 15
12345,2025-06-26 07:55:57.512,TimeMarker: OtherApp: Received result from OtherApp (SALE - ID:1721 Amount:16.00 Acc:1234)
12345,2025-06-26 07:55:58.017,TimeMarker: WebService: Sending result @20234ms. (DoPayment - ID:1721 Amount:16 Acc:1234)"
| eval _time = strptime(_time, "%F %T.%N")
| sort - _time
``` above emulates
index=application_na sourcetype=my_logs:hec appl="*" message="***"
```
From those 4 events, it's not clear to me which ones you want to calculate the time between. If you have multiple messages and only two of them are relevant to your calculation, can you not just include search constraints to find only the 2 you are interested in? If you have only 2 events, then you can use min/max as you are doing; see the sketch below.
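For example, a sketch that reuses your own min/max query but constrains the search to just the two `DoPayment` boundary messages (the wildcard patterns are assumptions based on your sample events):

index=application_na sourcetype=my_logs:hec appl="*" interactionid=12345
    (message="*Received request*" OR message="*Sending result*")
| stats min(_time) as StartTime, max(_time) as EndTime by interactionid
| eval Difference = EndTime - StartTime
| fieldformat StartTime = strftime(StartTime, "%Y-%m-%d %H:%M:%S.%3N")
| fieldformat EndTime = strftime(EndTime, "%Y-%m-%d %H:%M:%S.%3N")
| fieldformat Difference = tostring(Difference, "duration")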
Otherwise, you can use this type of logic
| sort 0 _time
| streamstats window=2 global=f range(_time) as duration by interactionid
which will sort the events into time-ascending order and add a new field to each event containing the duration (the time gap) between that event and the previous event for the same interactionid.
You could also use eval and stats (which would be faster than streamstats) to set a field with the start time of the event you want to find, and likewise for the end, then use stats to collect those new fields and calculate the duration, as in the sketch below.
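A minimal sketch of that eval-and-stats approach, assuming the same message patterns as in your samples (the intermediate field names are hypothetical):

index=application_na sourcetype=my_logs:hec appl="*" message="***"
``` tag each relevant event's _time into its own field ```
| eval dp_start = if(match(message, "Received request"), _time, null())
| eval dp_end = if(match(message, "Sending result"), _time, null())
| eval oa_start = if(match(message, "Sending request"), _time, null())
| eval oa_end = if(match(message, "Received result"), _time, null())
``` collect per interaction, then calculate the durations ```
| stats min(dp_start) as dp_start, max(dp_end) as dp_end,
    min(oa_start) as oa_start, max(oa_end) as oa_end by interactionid
| eval DoPaymentDuration = dp_end - dp_start, OtherAppDuration = oa_end - oa_start
| fieldformat DoPaymentDuration = tostring(DoPaymentDuration, "duration")
| fieldformat OtherAppDuration = tostring(OtherAppDuration, "duration")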
Also, note that you should never sort unless you know you need to, and in this case you don't. Be aware that sort has a default 10,000-result limit and will truncate your data to only 10,000 results (maybe not an issue in your case), so get used to using | sort 0 xxx to make sure your entire data set is sorted.
1. Don't use the "table" command for anything other than displaying your results in tabular format at the end of your search. Most importantly, don't use it early in your search! It moves processing to the search-head tier; in your case it prevents Splunk from effectively running stats in a distributed way across the indexers.
2. Well, we don't know your data. You've shown a single example with no additional explanation of the format of your events. The general idea would be to extract some form of a "transactionID" and use streamstats to copy the start of the transaction over to the ending event (or probably the other way around, since Splunk by default returns events in reverse chronological order). Then gather some general aggregations from the preprocessed data, as sketched below.
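To make point 2 concrete, a hedged sketch using your interactionid as the transaction key and the message patterns from your samples (both are assumptions about your real data):

index=application_na sourcetype=my_logs:hec appl="*" message="***"
``` default order is reverse chronological, so the ending event is seen first ```
| eval end_time = if(match(message, "Sending result"), _time, null())
| streamstats last(end_time) as end_time by interactionid
``` keep only the starting events, which now carry the matching end time ```
| where match(message, "Received request")
| eval duration = end_time - _time
| stats count, avg(duration) as avg_duration, max(duration) as max_duration by interactionid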