Hi all,
I have a very strange problem that I'm trying to solve.
I have a data source with the following fields: user, dest_ip, bytes, start_time, end_time.
I need to understand how long a user used network connections in each hour.
The problem is that I can have parallel sessions that I cannot simply sum, because I would end up with more than 60 minutes of connection in a single hour, which isn't acceptable.
In addition, I could have a connection from 10:05 to 10:10 and another from 10:45 to 10:50, so I cannot just take the start of the first and the end of the second.
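For example, take the 10:00-11:00 hour with those two sessions:

10:05-10:10 -> 5 minutes
10:45-10:50 -> 5 minutes
total in the hour = 10 minutes

Taking the start of the first and the end of the second would wrongly give 45 minutes, and if the two sessions overlapped, simply summing their durations would double-count the overlapping minutes.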
Can someone hint at how to approach this problem?
Ciao.
Giuseppe
Latest query (to aggregate per day):
| makeresults
| eval _raw="user,dest_ip,bytes,start_time,end_time
A,10.10.10.10,35980247,24/01/2020 14:02,25/01/2020 01:18
A,10.10.10.10,3797665,24/01/2020 14:02,24/01/2020 15:19
A,10.10.10.10,552,24/01/2020 14:14,24/01/2020 14:15
A,10.10.10.10,3200,24/01/2020 14:01,24/01/2020 14:02
A,10.10.10.10,11373297,23/01/2020 13:57,24/01/2020 00:26
A,10.10.10.10,73590782,23/01/2020 09:59,23/01/2020 23:31
A,10.10.10.10,1203724,23/01/2020 13:57,23/01/2020 14:26
A,10.10.10.10,3134,23/01/2020 13:57,23/01/2020 13:57
A,10.10.10.10,8302783,23/01/2020 09:59,23/01/2020 13:31
A,10.10.10.10,276,23/01/2020 10:20,23/01/2020 10:20
A,10.10.10.10,3104,23/01/2020 09:58,23/01/2020 09:59
A,10.10.10.10,3054,23/01/2020 09:59,23/01/2020 09:59
B,10.10.10.2,4859519,17/01/2020 14:21,17/01/2020 17:19
B,10.10.10.2,3370174,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,238523,17/01/2020 14:21,17/01/2020 16:19
B,10.10.10.2,3166705,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,2783036,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,3174766,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,552,17/01/2020 14:33,17/01/2020 14:34
B,10.10.10.2,24396,17/01/2020 15:56,17/01/2020 15:59
B,10.10.10.2,675660,17/01/2020 15:57,17/01/2020 16:33
B,10.10.10.2,162019,17/01/2020 15:57,17/01/2020 16:05
B,10.10.10.2,6388,17/01/2020 15:57,17/01/2020 15:59
B,10.10.10.2,398887,17/01/2020 16:00,17/01/2020 16:13
B,10.10.10.2,294,17/01/2020 16:00,17/01/2020 16:00
B,10.10.10.2,35324,17/01/2020 16:01,17/01/2020 16:04
B,10.10.10.2,294,17/01/2020 16:06,17/01/2020 16:07
B,10.10.10.2,181888,17/01/2020 16:11,17/01/2020 17:07
B,10.10.10.2,25668,17/01/2020 16:11,17/01/2020 16:14
B,10.10.10.2,517341,17/01/2020 16:19,17/01/2020 16:33
B,10.10.10.2,518877,17/01/2020 16:19,17/01/2020 16:33
B,10.10.10.2,80488,17/01/2020 16:27,17/01/2020 16:33
B,10.10.10.2,294,17/01/2020 16:27,17/01/2020 16:28
B,10.10.10.2,294,17/01/2020 16:27,17/01/2020 16:28
B,10.10.10.2,190050,17/01/2020 16:28,17/01/2020 16:33
B,10.10.10.2,39429,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,158281,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,39731,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,40338,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,100225,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,23526,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,28910,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,40860,17/01/2020 16:39,17/01/2020 16:43
B,10.10.10.2,14456,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,15816,17/01/2020 16:39,17/01/2020 16:42
C,10.10.10.2,6354,24/01/2020 13:36,24/01/2020 13:36
C,10.10.10.2,552,24/01/2020 13:55,24/01/2020 13:55
C,10.10.10.2,82751,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,254943,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,56569,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,104603,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,88522,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,168563,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,11535,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,8114,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,4691,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,25653,27/01/2020 09:07,27/01/2020 09:21
C,10.10.10.2,66339,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,52219,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,26527,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,7349,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,19109,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,20854,27/01/2020 09:09,27/01/2020 09:11
D,10.10.10.3,46783593,20/01/2020 15:27,20/01/2020 15:52
D,10.10.10.4,49106343,20/01/2020 15:27,20/01/2020 16:23
D,10.10.10.5,1033049,20/01/2020 15:38,20/01/2020 16:23
D,10.10.10.6,287151,20/01/2020 16:23,20/01/2020 16:24
D,10.10.10.4,132,20/01/2020 16:24,20/01/2020 16:24
D,10.10.10.4,62,20/01/2020 16:24,20/01/2020 16:24
D,10.10.10.5,4405811,20/01/2020 16:24,20/01/2020 16:59
D,10.10.10.4,18716525,20/01/2020 16:24,20/01/2020 16:44
D,10.10.10.7,40444,21/01/2020 09:21,21/01/2020 09:22
D,10.10.10.7,173988,21/01/2020 09:22,21/01/2020 09:24
D,10.10.10.7,2755,21/01/2020 09:24,21/01/2020 09:24
D,10.10.10.3,1148,21/01/2020 09:45,21/01/2020 09:46
D,10.10.10.3,7131,21/01/2020 09:46,21/01/2020 09:54
D,10.10.10.3,35138614,21/01/2020 09:54,21/01/2020 10:47"
| multikv forceheader=1
| table user,dest_ip,bytes,start_time,end_time
| foreach *_time
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) ]
| eval end_time=end_time+60
| sort user start_time
| eval user_dest=user.":".dest_ip
| streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest
| eval change_flag=if(t_end <= start_time,"over",null())
| streamstats count(eval(change_flag="over")) as session by user_dest
| stats min(start_time) as start_time max(end_time) as end_time by user_dest session
| eval duration = end_time - start_time
| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d"))
| mvexpand date
| eval time_d =round(relative_time(start_time,"+1d@d"))
| eval duration=if(date=tonumber(strftime(start_time,"%d")), time_d - start_time,start_time + duration - time_d)
| stats sum(duration) as duration by date user_dest
| table user_dest duration date
| eval duration=replace(tostring(duration,"duration"),"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s")
| foreach *_*
[ eval <<MATCHSEG1>>=mvindex(split(user_dest,":"),0)
| eval <<MATCHSEG2>>=mvindex(split(user_dest,":"),1)]
| table user dest duration date
| sort user dest date
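This query aggregates per day; if the per-hour figure from the original question is needed, here is a minimal untested sketch of the idea. It reuses the merged sessions produced by the stats ... by user_dest session step and would replace everything from the eval date=... line onwards, expanding each merged session into the hour buckets it touches:

| eval hour=mvrange(floor(start_time/3600)*3600, end_time, 3600)
| mvexpand hour
| eval seg_start=max(start_time,hour), seg_end=min(end_time,hour+3600)
| eval duration=seg_end-seg_start
| stats sum(duration) as duration by user_dest hour
| eval hour=strftime(hour,"%d/%m/%Y %H:00")
| eval duration=replace(tostring(duration,"duration"),"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s")

Since each merged session has no internal overlaps, no hour can exceed 60 minutes.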
Precondition:
Some rows have identical start_time and end_time (the timestamps only have minute resolution), so after strptime their duration would be zero; therefore 60 seconds is added to end_time uniformly. For example, user C's row from 24/01/2020 13:36 to 24/01/2020 13:36 would otherwise contribute nothing.
Way of thinking:
This is a hard problem. The key step is splitting a session that crosses midnight across the days it touches:
| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d"))
| mvexpand date
I want to praise myself here.
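A minimal sketch of just this step on one hand-made overnight session (only to show the mechanics; the if() around mvrange for same-day sessions is dropped because this example always crosses midnight):

| makeresults
| eval start_time=strptime("24/01/2020 14:02","%d/%m/%Y %H:%M"), end_time=strptime("25/01/2020 01:19","%d/%m/%Y %H:%M")
| eval duration=end_time-start_time
| eval date=mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1)
| mvexpand date
| eval time_d=round(relative_time(start_time,"+1d@d"))
| eval duration=if(date=tonumber(strftime(start_time,"%d")), time_d-start_time, start_time+duration-time_d)

This returns two rows: date=24 with duration 9h 58min (14:02 up to midnight) and date=25 with 1h 19min (midnight up to 01:19).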
Note:
The split only uses the day-of-month number, so a period that runs from the 31st into the 1st of the next month is not handled; create proper date fields for display if necessary.
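If month boundaries do matter, one possible variant (an untested sketch, not what the query above does) is to expand on day-start epochs instead of the day-of-month number, using the same clamping idea as the per-hour sketch above but with day buckets; these lines would replace everything from the eval date=... line through the stats sum(duration) line, ignoring DST shifts:

| eval day=mvrange(round(relative_time(start_time,"@d")), end_time, 86400)
| mvexpand day
| eval seg_start=max(start_time,day), seg_end=min(end_time,day+86400)
| eval duration=seg_end-seg_start
| stats sum(duration) as duration by user_dest day
| eval date=strftime(day,"%d/%m/%Y")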
Can you provide sample logs?
Does the dest_ip matter? Is it per user per dest_ip, or just per user?
per user per dest_ip.
Ciao.
Giuseppe