I am looking at logs in an IIS index. These are events performed by someone who is using a product that we make at the company I work at. So far, when someone logs in we have been using the (custom field) value of action=login to view this event.
In order to work out how long it takes someone to log in, we have simply been using the timetaken field for this action.
However, we have come to realize that what actually happens when someone logs in, is that the action=login starts the process, and then another log/event finishes this process, called aaction=event_status
Is it possible to find the time difference between these two events? I know they both have timestamps, which can be converted in epoch. The difference between these values is all we need, what is the easiest way to calculate this?
(However, we do get onto the issue of making sure that we are looking at the same person instance of the product. We can see this from the cs_username field. If anyone knows how to tackle this issue at the same time that would be hugely convenient, but one issue at a time will suffice for now 🙂 )
Thanks in advance!
You can try something like this
index=iis action=login cs_username=* | stats latest(_time) as time_login by cs_username | join cs_username [ search index=iis a_action=event_status cs_username=* | stats latest(_time) as time_finish by cs_username ] | eval difference=time_finish-time_login | eval difference=strftime(difference,"%d-%m-%Y %H:%M:%S")
Also pass the
earliest= latest= in the base search of both the search as desired.
let me know if this helps!
Thanks! This is really useful.
Do you know if there is any way to ensure that the second event (eventstatus) definitely happens after the first one (login)? And ideally, the first happening of eventstatus after the login event? (As it is a background process that happens fairly often, but is the way we benchmark the login finishing).
This is kind of an add-on question, so as you did answer the initial question I will accept the answer 🙂
well you can use
transaction command to look at the event details
index=iis action=login OR a_action=event_status cs_username=* | transaction cs_username startswith=action=login endswith=a_action=event_status
You can look at the event flow per csusername. and the positive time difference will always ensure that there is the first login and then eventsattus.
This will also create a
duration field which is same as your initial query i.e. time difference.
Couple possible alternatives that might work for you:
index=iis (action="login" OR aaction="eventstatus") csusername=* | eval starttime=if(action=="login",time,null()), endtime=if(aaction=="eventstatus",time,null()) | stats min(starttime) as starttime,max(endtime) as endtime by csusername | eval diff=abs(endtime-starttime)
| multisearch [| search index=iis action=login csusername=*] [| search index=iis aaction=eventstatus csusername=*] | stats earliest(time) as starttime,latest(time) as endtime by csusername | eval diff=abs(endtime-start_time)
Thanks! I couldn't get that first one to work (I think it didn't join the separate events together).
With the second one, it was giving me results but I am not sure it is definitely the time difference between those two specific events, not just any two events for a username.
The transaction command is the simplest way to aggregate related logs. Depending on your log volume and what you want to see, the following will show the 95th percentile of the time between two events.
index=iis cs_username=* | eval action = if(match(a_action, "event_status"), "login_complete", action) | transaction cs_username maxspan=5m startswith="action=login" endswith="action=login_com" | eventstats p95(duration) AS p95_duration `comment("| timechart avg(duration) AS avg_duration, p95(duration) AS p95_duration, max(duration) AS max_duration, min(duration) AS min_duration by cs_username | predict avg_duration p95_duration max_duration")`
If you click on the duration field on the left of the events list, it will show the average, minimum, maximum, and standard deviation. Or you can calculate those with timechart/stats/chart and get a table of values or a visual representation, and use predict to forecast the values.
Using a reasonable maxspan value and startswith will significantly reduce the number of transactions in memory. I think the default maxspan is 1 day, which can cause a large number of evicted records if you have a large log volume.