I'm trying to track the elapsed time it takes a user to complete a web application based on the earliest and latest occurrences of specific messages. The end goal is to have two separate charts, the first, a scatterplot with the date on the X axis and the elapsed times to complete on the Y axis (multiple values present each day); the second chart almost the same but an average of the elapsed time on the Y axis rather than all values.
My query thus far looks like:
sourcetype="PCF:log" cf_app_name=myApp (msg="*launch configuration*" OR msg="*complete for Account Owner*" )
| rex field=msg "AccountNum: (?<AccountNum>\w+)"
| rex field=msg "UserId: (?<UserID>\w+)"
| stats min(_time) as start_time max(_time) as end_time by AccountNum, UserID
| eval time_to_complete = end_time-start_time
| eval time_to_complete=strftime(time_to_complete, "%M:%S")
The shortcomings are:
1) A user can re-launch after completing (launch, complete, launch) and so the min/max _times can be off in these scenarios. I need something like earliest("launch configuration) and latest("complete for Account Owner") but not sure on the syntax for that.
2) The OR condition for the logs returns entries where only launch happens, and never complete. I'd like to discard those from the calculations rather than have a bunch of elapsed time of 00:00 present.
3) I'm not totally sure that this query will track combinations of users/accounts as I need. If User A launches with account 1 but never finishes, and user B launches with account 1 and does finish, I only (ideally) want the elapsed time for user B.
4) I'm generally struggling with how to incorporate the date component into a graph as well.
Sincerely appreciate help, and explanations welcome so I can better learn.