Splunk Search

sum multiple session duration

gcusello
SplunkTrust

Hi all,
I have a very strange problem that I'm trying to solve.
I have a data source with the following fields:

  • user
  • dest_ip
  • start_time
  • end_time

I need to understand how long a user used network connections in each hour.
The problem is that there can be parallel sessions that I cannot simply sum, because I would end up with more than 60 minutes of connection in a single hour, which isn't acceptable.
In addition, I could have a connection from 10:05 to 10:10 and another from 10:45 to 10:50, so I cannot just take the start of the first and the end of the second.

Can someone hint at how to approach the problem?

Ciao.
Giuseppe

1 Solution

to4kawa
Ultra Champion

Latest query (to aggregate per day):

| makeresults 
| eval _raw="user,dest_ip,bytes,start_time,end_time
A,10.10.10.10,35980247,24/01/2020 14:02,25/01/2020 01:18
A,10.10.10.10,3797665,24/01/2020 14:02,24/01/2020 15:19
A,10.10.10.10,552,24/01/2020 14:14,24/01/2020 14:15
A,10.10.10.10,3200,24/01/2020 14:01,24/01/2020 14:02
A,10.10.10.10,11373297,23/01/2020 13:57,24/01/2020 00:26
A,10.10.10.10,73590782,23/01/2020 09:59,23/01/2020 23:31
A,10.10.10.10,1203724,23/01/2020 13:57,23/01/2020 14:26
A,10.10.10.10,3134,23/01/2020 13:57,23/01/2020 13:57
A,10.10.10.10,8302783,23/01/2020 09:59,23/01/2020 13:31
A,10.10.10.10,276,23/01/2020 10:20,23/01/2020 10:20
A,10.10.10.10,3104,23/01/2020 09:58,23/01/2020 09:59
A,10.10.10.10,3054,23/01/2020 09:59,23/01/2020 09:59 
B,10.10.10.2,4859519,17/01/2020 14:21,17/01/2020 17:19
B,10.10.10.2,3370174,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,238523,17/01/2020 14:21,17/01/2020 16:19
B,10.10.10.2,3166705,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,2783036,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,3174766,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,552,17/01/2020 14:33,17/01/2020 14:34
B,10.10.10.2,24396,17/01/2020 15:56,17/01/2020 15:59
B,10.10.10.2,675660,17/01/2020 15:57,17/01/2020 16:33
B,10.10.10.2,162019,17/01/2020 15:57,17/01/2020 16:05
B,10.10.10.2,6388,17/01/2020 15:57,17/01/2020 15:59
B,10.10.10.2,398887,17/01/2020 16:00,17/01/2020 16:13
B,10.10.10.2,294,17/01/2020 16:00,17/01/2020 16:00
B,10.10.10.2,35324,17/01/2020 16:01,17/01/2020 16:04
B,10.10.10.2,294,17/01/2020 16:06,17/01/2020 16:07
B,10.10.10.2,181888,17/01/2020 16:11,17/01/2020 17:07
B,10.10.10.2,25668,17/01/2020 16:11,17/01/2020 16:14
B,10.10.10.2,517341,17/01/2020 16:19,17/01/2020 16:33
B,10.10.10.2,518877,17/01/2020 16:19,17/01/2020 16:33
B,10.10.10.2,80488,17/01/2020 16:27,17/01/2020 16:33
B,10.10.10.2,294,17/01/2020 16:27,17/01/2020 16:28
B,10.10.10.2,294,17/01/2020 16:27,17/01/2020 16:28
B,10.10.10.2,190050,17/01/2020 16:28,17/01/2020 16:33
B,10.10.10.2,39429,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,158281,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,39731,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,40338,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,100225,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,23526,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,28910,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,40860,17/01/2020 16:39,17/01/2020 16:43
B,10.10.10.2,14456,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,15816,17/01/2020 16:39,17/01/2020 16:42
C,10.10.10.2,6354,24/01/2020 13:36,24/01/2020 13:36
C,10.10.10.2,552,24/01/2020 13:55,24/01/2020 13:55
C,10.10.10.2,82751,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,254943,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,56569,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,104603,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,88522,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,168563,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,11535,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,8114,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,4691,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,25653,27/01/2020 09:07,27/01/2020 09:21
C,10.10.10.2,66339,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,52219,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,26527,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,7349,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,19109,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,20854,27/01/2020 09:09,27/01/2020 09:11
D,10.10.10.3,46783593,20/01/2020 15:27,20/01/2020 15:52
D,10.10.10.4,49106343,20/01/2020 15:27,20/01/2020 16:23
D,10.10.10.5,1033049,20/01/2020 15:38,20/01/2020 16:23
D,10.10.10.6,287151,20/01/2020 16:23,20/01/2020 16:24
D,10.10.10.4,132,20/01/2020 16:24,20/01/2020 16:24
D,10.10.10.4,62,20/01/2020 16:24,20/01/2020 16:24
D,10.10.10.5,4405811,20/01/2020 16:24,20/01/2020 16:59
D,10.10.10.4,18716525,20/01/2020 16:24,20/01/2020 16:44
D,10.10.10.7,40444,21/01/2020 09:21,21/01/2020 09:22
D,10.10.10.7,173988,21/01/2020 09:22,21/01/2020 09:24
D,10.10.10.7,2755,21/01/2020 09:24,21/01/2020 09:24
D,10.10.10.3,1148,21/01/2020 09:45,21/01/2020 09:46
D,10.10.10.3,7131,21/01/2020 09:46,21/01/2020 09:54
D,10.10.10.3,35138614,21/01/2020 09:54,21/01/2020 10:47" 
| multikv forceheader=1 
| table user,dest_ip,bytes,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
    | eval end_time=end_time+60] 
| sort user start_time
| eval user_dest=user.":".dest_ip
| streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user_dest 
| stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
| eval duration = end_time - start_time
| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
| mvexpand date
| eval time_d =round(relative_time(start_time,"+1d@d")) 
| eval duration=if(date=tonumber(strftime(start_time,"%d")), time_d - start_time,start_time + duration - time_d) 
| stats sum(duration) as duration by date user_dest 
| table user_dest duration date 
| eval duration=replace(tostring(duration,"duration"),"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s") 
| foreach *_* 
    [ eval <<MATCHSEG1>>=mvindex(split(user_dest,":"),0) 
    | eval <<MATCHSEG2>>=mvindex(split(user_dest,":"),1)] 
| table user dest duration date
| sort user dest date

Precondition:

Since start_time and end_time can be identical (the timestamps only have minute resolution), strptime alone would often give a zero duration, so 60 seconds is added to end_time uniformly.

Way of thinking:

  1. Rows belong to the same session, per user and dest_ip, until a row starts at or after the maximum end_time seen so far (see the sketch after this list).
  2. Within each session, find the minimum start_time and the maximum end_time.
  3. For each user, dest_ip and session, calculate the duration.
  4. To split per day, create a multivalue field with the days the session spans.
  5. Expand the days with mvexpand.
  6. Aggregate by user and day.
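
As a minimal sketch of steps 1 and 2, here is the same streamstats/change_flag logic applied to three made-up rows for one user: the two overlapping rows collapse into one session and the later row opens a new one. (The inline ``` ``` comments need a recent Splunk version and can simply be removed otherwise.)

| makeresults 
| eval _raw="user,dest_ip,start_time,end_time
A,10.10.10.10,24/01/2020 10:05,24/01/2020 10:10
A,10.10.10.10,24/01/2020 10:07,24/01/2020 10:20
A,10.10.10.10,24/01/2020 10:45,24/01/2020 10:50" 
| multikv forceheader=1 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
    | eval end_time=end_time+60] 
| sort user start_time 
| eval user_dest=user.":".dest_ip 
``` t_end = latest end_time seen in the previous rows of this user_dest ```
| streamstats current=f max(end_time) as t_end by user_dest 
``` a row that starts after everything seen so far opens a new session ```
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user_dest 
| stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
| eval duration = end_time - start_time

The first two rows become session 0 (10:05 to 10:21, including the +60s), the third becomes session 1 (10:45 to 10:51), so the user is charged 22 minutes instead of the 26 minutes that a plain sum of row durations would give.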

This was the hard part of the issue:

| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
| mvexpand date

I want to praise myself here.

Note: this does not handle the period between the 31st and the 1st (sessions spanning a month boundary), because the day-of-month number is used directly; create additional fields for display if necessary.
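
If the month boundary does matter, one possible workaround (an untested sketch, not part of the accepted answer) is to expand over day-aligned epoch times instead of the day-of-month number, replacing the date / mvexpand / time_d / duration lines above. It assumes 86400-second days, so DST changes would need extra care.

| eval day=mvrange(relative_time(start_time,"@d"), relative_time(end_time,"@d")+86400, 86400) 
| mvexpand day 
``` clip the session to each calendar day and measure only that part ```
| eval duration = min(end_time, day+86400) - max(start_time, day) 
| stats sum(duration) as duration by day user_dest 
| eval date=strftime(day,"%Y-%m-%d")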


to4kawa
Ultra Champion

Hi @gcusello,
this is my latest query, so that you can make the tally on a daily basis.


gcusello
SplunkTrust

Hi @to4kawa,
Thank you very much, you solved a big problem for me!
Just one thing, for the rest of the Community: please indicate clearly which version is the final solution, because I struggled a bit to find it among the many!
Thanks again.

Ciao.
Giuseppe


gcusello
SplunkTrust

Hi @to4kawa,
I have to ask you for a little extra effort, because there is still an error condition whose cause I can't find.
I noticed it during debugging, using the following limited data:

 bytes             dest_ip            end_time            start_time            user
 35980247    10.10.10.10    25/01/2020 01:18    24/01/2020 14:02    A
 3797665    10.10.10.10    24/01/2020 15:19    24/01/2020 14:02    A
 552            10.10.10.10    24/01/2020 14:15    24/01/2020 14:14    A
 3200            10.10.10.10    24/01/2020 14:02    24/01/2020 14:01    A
 11373297    10.10.10.10    24/01/2020 00:26    23/01/2020 13:57    A
 73590782    10.10.10.10    23/01/2020 23:31    23/01/2020 09:59    A
 1203724    10.10.10.10    23/01/2020 14:26    23/01/2020 13:57    A
 3134            10.10.10.10    23/01/2020 13:57    23/01/2020 13:57    A
 8302783    10.10.10.10    23/01/2020 13:31    23/01/2020 09:59    A
 276            10.10.10.10    23/01/2020 10:20    23/01/2020 10:20    A
 3104            10.10.10.10    23/01/2020 09:59    23/01/2020 09:58    A
 3054            10.10.10.10    23/01/2020 09:59    23/01/2020 09:59    A

I get no results, and while debugging I found that the problem is in the "session" field, which for this user always has value=0; also, in the first row p_start and t_end are missing.

bytes   dest_ip end_time    p_start session start_time  t_end   user    user_dest
35980247    10.10.10.10 1579911600      0   1579870920      UserA   UserA:10.10.10.10
3797665 10.10.10.10 1579875660  1579870920  0   1579870920  1579911600  UserA   UserA:10.10.10.10
552 10.10.10.10 1579871820  1579870920  0   1579871640  1579911600  UserA   UserA:10.10.10.10
3200    10.10.10.10 1579871040  1579870920  0   1579870860  1579911600  UserA   UserA:10.10.10.10
11373297    10.10.10.10 1579822080  1579870860  0   1579784220  1579911600  UserA   UserA:10.10.10.10
73590782    10.10.10.10 1579818780  1579784220  0   1579769940  1579911600  UserA   UserA:10.10.10.10
1203724 10.10.10.10 1579786080  1579769940  0   1579784220  1579911600  UserA   UserA:10.10.10.10
3134    10.10.10.10 1579784340  1579769940  0   1579784220  1579911600  UserA   UserA:10.10.10.10
8302783 10.10.10.10 1579782780  1579769940  0   1579769940  1579911600  UserA   UserA:10.10.10.10
276 10.10.10.10 1579771320  1579769940  0   1579771200  1579911600  UserA   UserA:10.10.10.10
3104    10.10.10.10 1579770060  1579769940  0   1579769880  1579911600  UserA   UserA:10.10.10.10
3054    10.10.10.10 1579770060  1579769880  0   1579769940  1579911600  UserA   UserA:10.10.10.10

Where's the problem?

Thank you again for your patience.
Bye.
Giuseppe


to4kawa
Ultra Champion
|makeresults
| eval _raw="bytes,dest_ip,end_time,start_time,user
35980247,10.10.10.10,25/01/2020 01:18,24/01/2020 14:02,A
3797665,10.10.10.10,24/01/2020 15:19,24/01/2020 14:02,A
552,10.10.10.10,24/01/2020 14:15,24/01/2020 14:14,A
3200,10.10.10.10,24/01/2020 14:02,24/01/2020 14:01,A
11373297,10.10.10.10,24/01/2020 00:26,23/01/2020 13:57,A
73590782,10.10.10.10,23/01/2020 23:31,23/01/2020 09:59,A
1203724,10.10.10.10,23/01/2020 14:26,23/01/2020 13:57,A
3134,10.10.10.10,23/01/2020 13:57,23/01/2020 13:57,A
8302783,10.10.10.10,23/01/2020 13:31,23/01/2020 09:59,A
276,10.10.10.10,23/01/2020 10:20,23/01/2020 10:20,A
3104,10.10.10.10,23/01/2020 09:59,23/01/2020 09:58,A
3054,10.10.10.10,23/01/2020 09:59,23/01/2020 09:59,A"
 | multikv forceheader=1 
 | table user,dest_ip,bytes,start_time,end_time 
 | foreach *_time 
     [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
     | eval end_time=end_time+60] 
 | sort user start_time
 | eval user_dest=user.":".dest_ip
 | streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest 
 | eval change_flag= if(t_end <= start_time,"over",NULL) 
 | streamstats count(eval(change_flag="over")) as session by user_dest 
 | stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
 | eval duration = end_time - start_time
 | eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
 | mvexpand date
 | eval time_d =round(relative_time(start_time,"+1d@d")) 
 | eval duration=if(date=tonumber(strftime(start_time,"%d")), time_d - start_time,start_time + duration - time_d) 
 | stats sum(duration) as duration by date user_dest 
 | table user_dest duration date 
 | eval duration=replace(tostring(duration,"duration"),"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s") 
 | foreach *_* 
     [ eval <<MATCHSEG1>>=mvindex(split(user_dest,":"),0) 
     | eval <<MATCHSEG2>>=mvindex(split(user_dest,":"),1)] 
 | table user dest duration date
 | sort user dest date

to4kawa
Ultra Champion

I'm sorry @gcusello,

the latest query is the right answer; do not use the reverse version.
I think it's okay now that my answers are organized.


to4kawa
Ultra Champion

To aggregate per user and dest_ip (per hour), but without considering sessions that span more than a day:

| table user,dest_ip,bytes,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
    | eval end_time=end_time+60] 
| eval user_dest=user.":".dest_ip 
| streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user_dest 
| stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
| eval duration = end_time - start_time
| eval time_tmp=if(strftime(start_time,"%H")!=strftime(end_time,"%H"),mvrange(tonumber(strftime(start_time,"%H")),tonumber(strftime(end_time,"%H"))+1),0)
|mvexpand time_tmp
| eval duration_ph=case(time_tmp=0,duration
,tonumber(strftime(start_time,"%H"))=time_tmp,round(relative_time(start_time,"+1h@h")) - start_time
,tonumber(strftime(end_time,"%H"))=time_tmp,end_time - round(relative_time(end_time,"@h")) 
,true(),3600)
| eval _time = if(time_tmp=0,relative_time(start_time,"@h"), strptime(strftime(start_time,"%F")." ".time_tmp,"%F %H"))
| table _time duration_ph user_dest
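
A quick sanity check that could be appended to the query above (a suggestion, reusing its _time, duration_ph and user_dest fields): no hourly bucket should ever exceed 3600 seconds.

| bin _time span=1h 
| stats sum(duration_ph) as duration_ph by _time user_dest 
``` any row returned here means more than 60 minutes were charged to a single hour ```
| where duration_ph > 3600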

to4kawa
Ultra Champion

Previous version:

....
| table user,dest_ip,bytes,start_time,end_time 
`comment("this is sample you provide. From here, the logic")`
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M"))
    | eval end_time=end_time+60] 
| streamstats window=1 current=f list(start_time) as p_start list(end_time) as t_end by user
| eval change_flag= if(t_end <= start_time,"over",NULL)
| streamstats count(eval(change_flag="over")) as session by user
| stats min(start_time) as start_time max(end_time) as end_time by user session
| eval duration = end_time - start_time
| stats sum(duration) as duration by user

Precondition:

Since start_time and end_time can be identical (the timestamps only have minute resolution), strptime alone would often give a zero duration, so 60 seconds is added to end_time uniformly.

Way of thinking:

  1. Rows belong to the same session until a row starts at or after the maximum end_time seen so far.
  2. Within each session, find the minimum start_time and the maximum end_time.
  3. For each user and session, calculate the duration.
  4. Aggregate by user.

Hi @gcusello

It was a very rewarding issue.
How about this?


gcusello
SplunkTrust

Hi @to4kawa,
Thank you for your help. I think that your approach is correct, but this isn't the final solution yet!
Replacing the last row of the search with

 | bin span=1h start_time
 | chart sum(duration) as duration OVER start_time BY user
 | eval start_time=strftime(start_time,"%Y-%m-%d %H:%M")

I see that sometimes (user B at 2020-01-17 14:00) there are more than 3600 seconds in an hour, which is impossible.

Can you give me more ideas? Anyway, I'll continue to explore your solution.

Ciao and thank you.
Giuseppe


to4kawa
Ultra Champion

Hi, @gcusello,
my query was made for aggregation over the whole search range, not for aggregation per hour, so it does not work well for that.

To handle each dest_ip separately,
we should either modify | eval change_flag= if(t_end <= start_time,"over",NULL)
or use | streamstats window=1 current=f list(start_time) as p_start list(end_time) as t_end by user dest_ip
I can't try this right now, sorry.
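
For reference, a sketch of that second option (the same session logic with dest_ip added to every by clause; untested, and the events must first be sorted accordingly):

| sort user dest_ip start_time 
| streamstats window=1 current=f list(start_time) as p_start list(end_time) as t_end by user dest_ip 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user dest_ip 
| stats min(start_time) as start_time max(end_time) as end_time by user dest_ip session 
| eval duration = end_time - start_time

The latest query later replaces window=1 list() with min()/max() over all previous rows, which is more robust when one long session outlasts several shorter ones that follow it.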

In my query, the duration for user B runs from 17/01/2020 14:21 to 17/01/2020 17:19;
many accesses are included in it.

I'm not sure about the per-hour breakdown. In this case, would it be:
14:00 ~ 15:00  39 min
15:00 ~ 16:00  60 min
16:00 ~ 17:00  60 min
17:00 ~ 18:00  19 min
Is this right?


gcusello
SplunkTrust

Hi @to4kawa,
the per-hour check is important to verify that the solution is correct, because there cannot be more than 60 minutes in an hour.
Anyway, your last breakdown is right:
14:00 ~ 15:00  39 min
15:00 ~ 16:00  60 min
16:00 ~ 17:00  60 min
17:00 ~ 18:00  19 min
I tried to implement it on top of your code, but I haven't succeeded yet.

Thank you.
Giuseppe


gcusello
SplunkTrust

Hi @to4kawa,
Your search is almost OK; I have only one issue: I need to divide the results by day.
I found that there can be connections that start on one day and end on the following one, and I need to split their durations by day.
Here is a subset of the data to test this:

bytes            dest_ip            end_time            start_time          user
35980247    10.10.10.10    25/01/2020 01:18    24/01/2020 14:02    A
3797665     10.10.10.10    24/01/2020 15:19    24/01/2020 14:02    A
552         10.10.10.10    24/01/2020 14:15    24/01/2020 14:14    A
3200        10.10.10.10    24/01/2020 14:02    24/01/2020 14:01    A
11373297    10.10.10.10    24/01/2020 00:26    23/01/2020 13:57    A
73590782    10.10.10.10    23/01/2020 23:31    23/01/2020 09:59    A
1203724     10.10.10.10    23/01/2020 14:26    23/01/2020 13:57    A
3134        10.10.10.10    23/01/2020 13:57    23/01/2020 13:57    A
8302783     10.10.10.10    23/01/2020 13:31    23/01/2020 09:59    A
276         10.10.10.10    23/01/2020 10:20    23/01/2020 10:20    A
3104        10.10.10.10    23/01/2020 09:59    23/01/2020 09:58    A
3054        10.10.10.10    23/01/2020 09:59    23/01/2020 09:59    A

This is your search as I used it:

 my_search
| foreach *_time [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M"))
| eval b_<<MATCHSTR>> = strftime('<<FIELD>>',"%d/%m %H:%M")]
| table user dest_ip bytes b_start start_time b_end end_time
| streamstats min(start_time) as start_p max(end_time) as end_p by user
| streamstats min(eval(min(end_p,start_time))) as start_p  max(eval(max(end_p,end_time))) as end_p by user dest_ip
| table user dest_ip bytes b_start start_time start_p b_end end_time end_p
| eval start_check =nullif(start_time >= start_p AND end_time >= end_p,null())
| streamstats dc(eval(mvzip(start_p,end_p))) as flag
| dedup flag
| eval date=strftime(start_time,"%d")
| streamstats min(start_time) as start_p max(end_time) as end_p current=f by user dest_ip date
| reverse
| dedup start_time user
| reverse
| eval duration=end_time - start_time
| stats sum(duration) as duration max(date) as date by user dest_ip
| convert rmunit(duration) as diff
| eval diff=round(diff)
| eval difference=tostring(diff,"duration")
| eval difference = replace(difference,"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s")
| table user dest_ip difference date

Ciao and Thank you.
Giuseppe


arjunpkishore5
Motivator

Hi @gcusello

Give this a shot; slight tweaking might be needed.

base_search is your search for the events, with the times converted to epoch.
This query gives duration per user; if you need it per user and per dest_ip, please add dest_ip to all the group-bys and sorts.
What I'm doing here is basically removing overlapping timelines and grouping them into one continuous session.

base_search
| fields user, dest_ip, start_time, end_time
| sort user, dest_ip, -start_time
| streamstats max(start_time) as next_start by user, dest_ip window=1 reset_on_change=true current=false
| sort user, dest_ip, start_time
| eval next_start=coalesce(next_start, start_time), row_group=1
| where end_time>next_start
| streamstats sum(row_group) as row_group by user, dest_ip reset_on_change=true 
| eval row_group=(floor(coalesce(row_group,0)/2))
| stats min(start_time) as start_time, max(end_time) as end_time by user, dest_ip, row_group
| eval duration=end_time-start_time
| rename start_time as _time
| bucket span=1h
| stats sum(duration) as duration by user, dest_ip, _time
| eval over_time=if(duration>60, duration-60, 0)
| streamstats max(over_time) as add_time by user, dest_ip window=1 reset_on_change=true current=false
| fillnull value=0 add_time
| eval duration=duration + add_time - over_time

Hope this helps

Cheers

gcusello
SplunkTrust

Hi @arjunpkishore5,
Thank you for your help. I think that your approach is correct, but this isn't the final solution yet!
Replacing the last row of the search with

  | bin span=1h start_time
  | chart sum(duration) as duration OVER start_time BY user
  | eval start_time=strftime(start_time,"%Y-%m-%d %H:%M")

I see that sometimes (user B at 2020-01-17 14:00) there are more than 3600 seconds in an hour, which is impossible.

Can you give me more ideas? Anyway, I'll continue to explore your solution.

Ciao and thank you.
Giuseppe


arjunpkishore5
Motivator

I edited my answer to accommodate this condition: from line 13 onward I'm calculating over_time, which is the part of the duration over 60, and adding it to the next hour's time. I haven't tested the query; I'm just going with the algorithm in my head and converting it to code, so some fixes may be needed. 🙂
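
For readers looking for that part, here is the carry-over portion of the query above with comments added (one reading of it; note that duration is computed in epoch seconds, so the 60 threshold may be intended as 3600, i.e. one hour):

| eval over_time=if(duration>60, duration-60, 0) 
``` over_time: the part of this bucket's total above the cap (60 as written) ```
| streamstats max(over_time) as add_time by user, dest_ip window=1 reset_on_change=true current=false 
``` add_time: the overflow carried over from the previous bucket of the same user and dest_ip ```
| fillnull value=0 add_time 
| eval duration=duration + add_time - over_time 
``` each bucket keeps its own total minus its overflow, plus the overflow carried in ```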

Cheers


arjunpkishore5
Motivator

I also added dest_ip to the group by 🙂


to4kawa
Ultra Champion
| makeresults
| eval _raw="user,dest_ip,start_time,end_time
A,X.X.X.X,2020/02/13 10:05,2020/02/13 10:10
A,X.X.X.X,2020/02/13 10:45,2020/02/13 10:50"
| multikv forceheader=1
| table user,dest_ip,start_time,end_time
| foreach *_time [ eval <<FIELD>> = strptime('<<FIELD>>', "%Y/%m/%d %H:%M")]
| fieldformat start_time = strftime(start_time, "%Y/%m/%d %H:%M") 
| fieldformat end_time = strftime(end_time, "%Y/%m/%d %H:%M") 
| addinfo
| eval duration = coalesce(end_time,info_max_time) - coalesce(start_time,info_min_time)
| eventstats sum(duration) as total_duration by user
| table user,dest_ip,start_time,end_time duration total_duration

Hi, @gcusello,
anyway, I made a query.
It considers the case of sessions starting or ending outside the range of the time picker.


gcusello
SplunkTrust

Hi @to4kawa,
Thank you for your help.
The problem isn't finding the duration of each transaction, because each row already has start_time and end_time. The problem is that I usually have several parallel sessions, so if I simply sum all the durations I get more than the elapsed time period; instead, I need to know for how long the user actually used the connection. For example, in the data below for user B, one row runs from 14:21 to 17:19 (178 minutes) and another from 14:21 to 15:39 (78 minutes): summing them gives 256 minutes, but the second interval is entirely contained in the first, so the user actually used the connection for only 178 minutes in that window.

Below you can find some examples of my data:

user dest_ip    bytes   start_time  end_time
A   10.10.10.1  330 16/01/2020 09:30    16/01/2020 09:31
B   10.10.10.2  4859519 17/01/2020 14:21    17/01/2020 17:19
B   10.10.10.2  3370174 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  238523  17/01/2020 14:21    17/01/2020 16:19
B   10.10.10.2  3166705 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  2783036 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  3174766 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  552 17/01/2020 14:33    17/01/2020 14:34
B   10.10.10.2  24396   17/01/2020 15:56    17/01/2020 15:59
B   10.10.10.2  675660  17/01/2020 15:57    17/01/2020 16:33
B   10.10.10.2  162019  17/01/2020 15:57    17/01/2020 16:05
B   10.10.10.2  6388    17/01/2020 15:57    17/01/2020 15:59
B   10.10.10.2  398887  17/01/2020 16:00    17/01/2020 16:13
B   10.10.10.2  294 17/01/2020 16:00    17/01/2020 16:00
B   10.10.10.2  35324   17/01/2020 16:01    17/01/2020 16:04
B   10.10.10.2  294 17/01/2020 16:06    17/01/2020 16:07
B   10.10.10.2  181888  17/01/2020 16:11    17/01/2020 17:07
B   10.10.10.2  25668   17/01/2020 16:11    17/01/2020 16:14
B   10.10.10.2  517341  17/01/2020 16:19    17/01/2020 16:33
B   10.10.10.2  518877  17/01/2020 16:19    17/01/2020 16:33
B   10.10.10.2  80488   17/01/2020 16:27    17/01/2020 16:33
B   10.10.10.2  294 17/01/2020 16:27    17/01/2020 16:28
B   10.10.10.2  294 17/01/2020 16:27    17/01/2020 16:28
B   10.10.10.2  190050  17/01/2020 16:28    17/01/2020 16:33
B   10.10.10.2  39429   17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  158281  17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  39731   17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  40338   17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  100225  17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  23526   17/01/2020 16:39    17/01/2020 16:42
B   10.10.10.2  28910   17/01/2020 16:39    17/01/2020 16:42
B   10.10.10.2  40860   17/01/2020 16:39    17/01/2020 16:43
B   10.10.10.2  14456   17/01/2020 16:39    17/01/2020 16:42
B   10.10.10.2  15816   17/01/2020 16:39    17/01/2020 16:42
C   10.10.10.2  6354    24/01/2020 13:36    24/01/2020 13:36
C   10.10.10.2  552 24/01/2020 13:55    24/01/2020 13:55
C   10.10.10.2  82751   27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  254943  27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  56569   27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  104603  27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  88522   27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  168563  27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  11535   27/01/2020 09:07    27/01/2020 09:09
C   10.10.10.2  8114    27/01/2020 09:07    27/01/2020 09:09
C   10.10.10.2  4691    27/01/2020 09:07    27/01/2020 09:09
C   10.10.10.2  25653   27/01/2020 09:07    27/01/2020 09:21
C   10.10.10.2  66339   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  52219   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  26527   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  7349    27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  19109   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  20854   27/01/2020 09:09    27/01/2020 09:11
D   10.10.10.3  46783593    20/01/2020 15:27    20/01/2020 15:52
D   10.10.10.4  49106343    20/01/2020 15:27    20/01/2020 16:23
D   10.10.10.5  1033049 20/01/2020 15:38    20/01/2020 16:23
D   10.10.10.6  287151  20/01/2020 16:23    20/01/2020 16:24
D   10.10.10.4  132 20/01/2020 16:24    20/01/2020 16:24
D   10.10.10.4  62  20/01/2020 16:24    20/01/2020 16:24
D   10.10.10.5  4405811 20/01/2020 16:24    20/01/2020 16:59
D   10.10.10.4  18716525    20/01/2020 16:24    20/01/2020 16:44
D   10.10.10.7  40444   21/01/2020 09:21    21/01/2020 09:22
D   10.10.10.7  173988  21/01/2020 09:22    21/01/2020 09:24
D   10.10.10.7  2755    21/01/2020 09:24    21/01/2020 09:24
D   10.10.10.3  1148    21/01/2020 09:45    21/01/2020 09:46
D   10.10.10.3  7131    21/01/2020 09:46    21/01/2020 09:54
D   10.10.10.3  35138614    21/01/2020 09:54    21/01/2020 10:47

Ciao.
Giuseppe


to4kawa
Ultra Champion
 C    10.10.10.2    6354    24/01/2020 13:36    24/01/2020 13:36

Does this row have seconds?


gcusello
SplunkTrust

No, the time format is %d/%m/%Y %H:%M.

Ciao.
Giuseppe
