Hey, quite a long post, but I'm going crazy here trying to solve this problem:
I have a connection log of:
id, username, start_time, end_time
there are many users and I want to be able to analyze connection times for those users. ie top users that are connected at different times of the day (night, morning, afternoon). also looking at specific users and seeing their overall connection plotted over a 24 hour chart.
So, I went on a quest to find relating posts, found a few, with this helpful one:
timechart-of-open-sessions-per-user
But here I don't have seperate "login" and "logout" events, it's all in one event.
What I did (for the time being for one user)
... | eval dif=1 | append [search ... | eval _time=strptime(end_time,"%Y-%m-%d %T") | eval dif=-1 ] | sort +_time | streamstats sum(dif) as openSessions by username | timechart max(openSessions) by username | filldown
Problem #1:
VERY VERY VERY weird thing, looks to me like a bug - in every search (which has more than a 100 events in it) there is one event missing from the search that I appended. and it screws all the stream stats up, i get something like:
00:00:00 logon
01:00:00 logoff
01:30:00 logon
01:50:00 logoff
02:30:00 logon
04:00:00 logon
04:30:00 logoff
between 2:30 and 4:00 there was a logoff, and it's there in the 2:30 event, let's say at 03:00. but for some reason it doesn't get appended. What the hell?
Problem #2:
Even if I make this go away, it still doesn't solve my wish to get statistics over all of the users. what i would like to get is a table of all the users something like,
user/hour 00 01 02 03 04 ... 23
user1
user2
Every cell would say the number of times the user was connected at that time:
That append would really kill it, and I also don't know how to do those sums withot specifically stating all the users somehow in the stats that i will do on the timechart. Is there a way?
Do you have some sample event data? Could build you a sample query on that.
Basically you could just write two field extractions (using rex per search or build it so it's at index time) to define new fields for the start_time and end_time in each event then it's just a matter of using eval to calculate the difference (in secs/mins/hours) between the two and populate a new field login_duration or whatever.
EDITED:
Sorry I was lazy in my original post and on my mobile... here's a sample search to use as reference:
First off my dataset (tried to guess yours):
7829 joshd Jun 18 21:26:22 Jun 18 22:35:27
8293 john Jun 18 22:37:21 Jun 18 22:42:44
Search:
... | rex field=_raw "\S+\s\d+\s\d+:\d+:\d+\s(?<end_time>\S+\s\d+\s\d+\:\d+\:\d+)" | eval end_time=strptime(end_time,"%b %d %H:%M:%S") | eval login_duration=end_time-_time
You'll now see a field named login_duration that is the amount of time between the logins in seconds. Oh and you'll notice I simply used _time instead of extracting two individual fields (it was a lazy cheat) but if you dont want to use _time then define a second command line field extraction using another rex command same to what I did above just reverse the order.
That doesn't answer the question. I already did what you did here. My problem is looking at the times in between start and end and analyzing them. Not only the duration.