Splunk Search

Sum multiple session durations

gcusello
SplunkTrust

Hi all,
I have a very strange problem that I'm trying to solve.
I have a data source with the following fields:

  • user
  • dest_ip
  • start_time
  • end_time

I have to understand how long a user used network connections in each hour.
The problem is that I can have parallel sessions that I cannot simply sum, because I would end up with more than 60 minutes of connection in a single hour, which isn't acceptable.
In addition, I could have a connection from 10:05 to 10:10 and another from 10:45 to 10:50, so I cannot just take the start of the first and the end of the second.

Can someone hint at how to approach the problem?

Ciao.
Giuseppe

0 Karma
1 Solution

to4kawa
Ultra Champion

Latest query (to aggregate per day):

| makeresults 
| eval _raw="user,dest_ip,bytes,start_time,end_time
A,10.10.10.10,35980247,24/01/2020 14:02,25/01/2020 01:18
A,10.10.10.10,3797665,24/01/2020 14:02,24/01/2020 15:19
A,10.10.10.10,552,24/01/2020 14:14,24/01/2020 14:15
A,10.10.10.10,3200,24/01/2020 14:01,24/01/2020 14:02
A,10.10.10.10,11373297,23/01/2020 13:57,24/01/2020 00:26
A,10.10.10.10,73590782,23/01/2020 09:59,23/01/2020 23:31
A,10.10.10.10,1203724,23/01/2020 13:57,23/01/2020 14:26
A,10.10.10.10,3134,23/01/2020 13:57,23/01/2020 13:57
A,10.10.10.10,8302783,23/01/2020 09:59,23/01/2020 13:31
A,10.10.10.10,276,23/01/2020 10:20,23/01/2020 10:20
A,10.10.10.10,3104,23/01/2020 09:58,23/01/2020 09:59
A,10.10.10.10,3054,23/01/2020 09:59,23/01/2020 09:59 
B,10.10.10.2,4859519,17/01/2020 14:21,17/01/2020 17:19
B,10.10.10.2,3370174,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,238523,17/01/2020 14:21,17/01/2020 16:19
B,10.10.10.2,3166705,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,2783036,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,3174766,17/01/2020 14:21,17/01/2020 15:39
B,10.10.10.2,552,17/01/2020 14:33,17/01/2020 14:34
B,10.10.10.2,24396,17/01/2020 15:56,17/01/2020 15:59
B,10.10.10.2,675660,17/01/2020 15:57,17/01/2020 16:33
B,10.10.10.2,162019,17/01/2020 15:57,17/01/2020 16:05
B,10.10.10.2,6388,17/01/2020 15:57,17/01/2020 15:59
B,10.10.10.2,398887,17/01/2020 16:00,17/01/2020 16:13
B,10.10.10.2,294,17/01/2020 16:00,17/01/2020 16:00
B,10.10.10.2,35324,17/01/2020 16:01,17/01/2020 16:04
B,10.10.10.2,294,17/01/2020 16:06,17/01/2020 16:07
B,10.10.10.2,181888,17/01/2020 16:11,17/01/2020 17:07
B,10.10.10.2,25668,17/01/2020 16:11,17/01/2020 16:14
B,10.10.10.2,517341,17/01/2020 16:19,17/01/2020 16:33
B,10.10.10.2,518877,17/01/2020 16:19,17/01/2020 16:33
B,10.10.10.2,80488,17/01/2020 16:27,17/01/2020 16:33
B,10.10.10.2,294,17/01/2020 16:27,17/01/2020 16:28
B,10.10.10.2,294,17/01/2020 16:27,17/01/2020 16:28
B,10.10.10.2,190050,17/01/2020 16:28,17/01/2020 16:33
B,10.10.10.2,39429,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,158281,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,39731,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,40338,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,100225,17/01/2020 16:36,17/01/2020 16:39
B,10.10.10.2,23526,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,28910,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,40860,17/01/2020 16:39,17/01/2020 16:43
B,10.10.10.2,14456,17/01/2020 16:39,17/01/2020 16:42
B,10.10.10.2,15816,17/01/2020 16:39,17/01/2020 16:42
C,10.10.10.2,6354,24/01/2020 13:36,24/01/2020 13:36
C,10.10.10.2,552,24/01/2020 13:55,24/01/2020 13:55
C,10.10.10.2,82751,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,254943,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,56569,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,104603,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,88522,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,168563,27/01/2020 09:06,27/01/2020 09:09
C,10.10.10.2,11535,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,8114,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,4691,27/01/2020 09:07,27/01/2020 09:09
C,10.10.10.2,25653,27/01/2020 09:07,27/01/2020 09:21
C,10.10.10.2,66339,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,52219,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,26527,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,7349,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,19109,27/01/2020 09:09,27/01/2020 09:11
C,10.10.10.2,20854,27/01/2020 09:09,27/01/2020 09:11
D,10.10.10.3,46783593,20/01/2020 15:27,20/01/2020 15:52
D,10.10.10.4,49106343,20/01/2020 15:27,20/01/2020 16:23
D,10.10.10.5,1033049,20/01/2020 15:38,20/01/2020 16:23
D,10.10.10.6,287151,20/01/2020 16:23,20/01/2020 16:24
D,10.10.10.4,132,20/01/2020 16:24,20/01/2020 16:24
D,10.10.10.4,62,20/01/2020 16:24,20/01/2020 16:24
D,10.10.10.5,4405811,20/01/2020 16:24,20/01/2020 16:59
D,10.10.10.4,18716525,20/01/2020 16:24,20/01/2020 16:44
D,10.10.10.7,40444,21/01/2020 09:21,21/01/2020 09:22
D,10.10.10.7,173988,21/01/2020 09:22,21/01/2020 09:24
D,10.10.10.7,2755,21/01/2020 09:24,21/01/2020 09:24
D,10.10.10.3,1148,21/01/2020 09:45,21/01/2020 09:46
D,10.10.10.3,7131,21/01/2020 09:46,21/01/2020 09:54
D,10.10.10.3,35138614,21/01/2020 09:54,21/01/2020 10:47" 
| multikv forceheader=1 
| table user,dest_ip,bytes,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
    | eval end_time=end_time+60] 
| sort user start_time
| eval user_dest=user.":".dest_ip
| streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user_dest 
| stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
| eval duration = end_time - start_time
| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
| mvexpand date
| eval time_d =round(relative_time(start_time,"+1d@d")) 
| eval duration=if(date=tonumber(strftime(start_time,"%d")), time_d - start_time,start_time + duration - time_d) 
| stats sum(duration) as duration by date user_dest 
| table user_dest duration date 
| eval duration=replace(tostring(duration,"duration"),"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s") 
| foreach *_* 
    [ eval <<MATCHSEG1>>=mvindex(split(user_dest,":"),0) 
    | eval <<MATCHSEG2>>=mvindex(split(user_dest,":"),1)] 
| table user dest duration date
| sort user dest date

Precondition:

Since start_time and end_time can be identical (the timestamps only have minute resolution), strptime alone would often give a zero duration, so 60 seconds is added to end_time uniformly.
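For illustration, a tiny run-anywhere sketch (synthetic timestamps, not part of the solution) of why the padding is needed:

| makeresults 
| eval start_time="23/01/2020 09:59", end_time="23/01/2020 09:59" 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) ] 
| eval raw_duration = end_time - start_time 
| eval padded_duration = (end_time + 60) - start_time

raw_duration comes out as 0 even though the session existed, while padded_duration is 60.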

Way of thinking:

  1. As long as a row's start_time is earlier than the running maximum end_time, it belongs to the same session per user and dest_ip; otherwise a new session begins.
  2. Within each session, find the minimum start_time and the maximum end_time.
  3. For each user, dest_ip and session, calculate the duration.
  4. For sessions that span more than one day, create a multivalue field of days.
  5. Expand the days (a sketch follows the note below).
  6. Aggregate by user and day.

This is the hard part:

| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
| mvexpand date

I want to praise myself here.

Note: sessions that cross a month boundary (for example from the 31st to the 1st) are not handled, so create additional fields for display if necessary.
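For illustration, a minimal run-anywhere sketch (one synthetic session that crosses midnight, not from the sample data) of how the mvrange/mvexpand trick splits a session into one row per day:

| makeresults 
| eval start_time=strptime("23/01/2020 22:30","%d/%m/%Y %H:%M"), end_time=strptime("24/01/2020 01:30","%d/%m/%Y %H:%M") 
| eval duration = end_time - start_time 
| eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
| mvexpand date 
| eval time_d = round(relative_time(start_time,"+1d@d")) 
| eval duration = if(date=tonumber(strftime(start_time,"%d")), time_d - start_time, start_time + duration - time_d) 
| eval duration = tostring(duration,"duration") 
| table date duration

Day 23 gets 01:30:00 (22:30 to midnight) and day 24 gets 01:30:00 (midnight to 01:30).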


to4kawa
Ultra Champion

Hi @gcusello,
this is my latest query; the tally is now made on a daily basis.

0 Karma

gcusello
SplunkTrust

Hi @to4kawa,
Thank you very much, you solved a big problem for me!
Just one thing, for the rest of the Community: please add a clear indication of which version is the final solution, because I struggled a bit to find it among the many!
Thanks again.

Ciao.
Giuseppe

0 Karma

gcusello
SplunkTrust

Hi @to4kawa,
I have to ask you for a little extra effort, because there is still an error condition that I can't track down.
I noticed it during debugging, using the following limited data:

 bytes       dest_ip        end_time            start_time          user
 35980247    10.10.10.10    25/01/2020 01:18    24/01/2020 14:02    A
 3797665     10.10.10.10    24/01/2020 15:19    24/01/2020 14:02    A
 552         10.10.10.10    24/01/2020 14:15    24/01/2020 14:14    A
 3200        10.10.10.10    24/01/2020 14:02    24/01/2020 14:01    A
 11373297    10.10.10.10    24/01/2020 00:26    23/01/2020 13:57    A
 73590782    10.10.10.10    23/01/2020 23:31    23/01/2020 09:59    A
 1203724     10.10.10.10    23/01/2020 14:26    23/01/2020 13:57    A
 3134        10.10.10.10    23/01/2020 13:57    23/01/2020 13:57    A
 8302783     10.10.10.10    23/01/2020 13:31    23/01/2020 09:59    A
 276         10.10.10.10    23/01/2020 10:20    23/01/2020 10:20    A
 3104        10.10.10.10    23/01/2020 09:59    23/01/2020 09:58    A
 3054        10.10.10.10    23/01/2020 09:59    23/01/2020 09:59    A

I get no results, and while debugging I found that the problem is in the "session" field, which for this user always has value=0, and that in the first row p_start and t_end are missing.

bytes   dest_ip end_time    p_start session start_time  t_end   user    user_dest
35980247    10.10.10.10 1579911600      0   1579870920      UserA   UserA:10.10.10.10
3797665 10.10.10.10 1579875660  1579870920  0   1579870920  1579911600  UserA   UserA:10.10.10.10
552 10.10.10.10 1579871820  1579870920  0   1579871640  1579911600  UserA   UserA:10.10.10.10
3200    10.10.10.10 1579871040  1579870920  0   1579870860  1579911600  UserA   UserA:10.10.10.10
11373297    10.10.10.10 1579822080  1579870860  0   1579784220  1579911600  UserA   UserA:10.10.10.10
73590782    10.10.10.10 1579818780  1579784220  0   1579769940  1579911600  UserA   UserA:10.10.10.10
1203724 10.10.10.10 1579786080  1579769940  0   1579784220  1579911600  UserA   UserA:10.10.10.10
3134    10.10.10.10 1579784340  1579769940  0   1579784220  1579911600  UserA   UserA:10.10.10.10
8302783 10.10.10.10 1579782780  1579769940  0   1579769940  1579911600  UserA   UserA:10.10.10.10
276 10.10.10.10 1579771320  1579769940  0   1579771200  1579911600  UserA   UserA:10.10.10.10
3104    10.10.10.10 1579770060  1579769940  0   1579769880  1579911600  UserA   UserA:10.10.10.10
3054    10.10.10.10 1579770060  1579769880  0   1579769940  1579911600  UserA   UserA:10.10.10.10

Where's the problem?

Thank you again for your patience.
Bye.
Giuseppe

0 Karma

to4kawa
Ultra Champion
|makeresults
| eval _raw="bytes,dest_ip,end_time,start_time,user
35980247,10.10.10.10,25/01/2020 01:18,24/01/2020 14:02,A
3797665,10.10.10.10,24/01/2020 15:19,24/01/2020 14:02,A
552,10.10.10.10,24/01/2020 14:15,24/01/2020 14:14,A
3200,10.10.10.10,24/01/2020 14:02,24/01/2020 14:01,A
11373297,10.10.10.10,24/01/2020 00:26,23/01/2020 13:57,A
73590782,10.10.10.10,23/01/2020 23:31,23/01/2020 09:59,A
1203724,10.10.10.10,23/01/2020 14:26,23/01/2020 13:57,A
3134,10.10.10.10,23/01/2020 13:57,23/01/2020 13:57,A
8302783,10.10.10.10,23/01/2020 13:31,23/01/2020 09:59,A
276,10.10.10.10,23/01/2020 10:20,23/01/2020 10:20,A
3104,10.10.10.10,23/01/2020 09:59,23/01/2020 09:58,A
3054,10.10.10.10,23/01/2020 09:59,23/01/2020 09:59,A"
 | multikv forceheader=1 
 | table user,dest_ip,bytes,start_time,end_time 
 | foreach *_time 
     [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
     | eval end_time=end_time+60] 
 | sort user start_time
 | eval user_dest=user.":".dest_ip
 | streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest 
 | eval change_flag= if(t_end <= start_time,"over",NULL) 
 | streamstats count(eval(change_flag="over")) as session by user_dest 
 | stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
 | eval duration = end_time - start_time
 | eval date=if(strftime(start_time,"%d")!=strftime(end_time,"%d"),mvrange(tonumber(strftime(start_time,"%d")),tonumber(strftime(end_time,"%d"))+1),strftime(start_time,"%d")) 
 | mvexpand date
 | eval time_d =round(relative_time(start_time,"+1d@d")) 
 | eval duration=if(date=tonumber(strftime(start_time,"%d")), time_d - start_time,start_time + duration - time_d) 
 | stats sum(duration) as duration by date user_dest 
 | table user_dest duration date 
 | eval duration=replace(tostring(duration,"duration"),"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s") 
 | foreach *_* 
     [ eval <<MATCHSEG1>>=mvindex(split(user_dest,":"),0) 
     | eval <<MATCHSEG2>>=mvindex(split(user_dest,":"),1)] 
 | table user dest duration date
 | sort user dest date
0 Karma

to4kawa
Ultra Champion

I'm sorry @gcusello

The latest query is the right answer; do not use the reverse version.
I think it's clear now that my answers are organized.

0 Karma

to4kawa
Ultra Champion

To aggregate per user, dest_ip and hour, but without handling sessions that cross into another day:

| table user,dest_ip,bytes,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
    | eval end_time=end_time+60] 
| eval user_dest=user.":".dest_ip 
| streamstats current=f min(start_time) as p_start max(end_time) as t_end by user_dest 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user_dest 
| stats min(start_time) as start_time max(end_time) as end_time by user_dest session 
| eval duration = end_time - start_time
| eval time_tmp=if(strftime(start_time,"%H")!=strftime(end_time,"%H"),mvrange(tonumber(strftime(start_time,"%H")),tonumber(strftime(end_time,"%H"))+1),0)
|mvexpand time_tmp
| eval duration_ph=case(time_tmp=0,duration
,tonumber(strftime(start_time,"%H"))=time_tmp,round(relative_time(start_time,"+1h@h")) - start_time
,tonumber(strftime(end_time,"%H"))=time_tmp,end_time - round(relative_time(end_time,"@h")) 
,true(),3600)
| eval _time = if(time_tmp=0,relative_time(start_time,"@h"), strptime(strftime(start_time,"%F")." ".time_tmp,"%F %H"))
| table _time duration_ph user_dest
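If you want to verify @gcusello's constraint that no hour may exceed 3600 seconds, a small check can be appended to the query above (a sketch, reusing the field names produced there):

| stats sum(duration_ph) as secs_in_hour by _time user_dest 
| where secs_in_hour > 3600

If this returns no rows, no user:dest_ip pair is ever credited with more than 60 minutes in a single hour.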
0 Karma

to4kawa
Ultra Champion

Previous version:

....
| table user,dest_ip,bytes,start_time,end_time 
`comment("this is the sample you provided; the logic starts from here")`
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M"))
    | eval end_time=end_time+60] 
| streamstats window=1 current=f list(start_time) as p_start list(end_time) as t_end by user
| eval change_flag= if(t_end <= start_time,"over",NULL)
| streamstats count(eval(change_flag="over")) as session by user
| stats min(start_time) as start_time max(end_time) as end_time by user session
| eval duration = end_time - start_time
| stats sum(duration) as duration by user

Precondition:

Since start_time and end_time can be identical (the timestamps only have minute resolution), strptime alone would often give a zero duration, so 60 seconds is added to end_time uniformly.

Way of thinking:

  1. As long as a row's start_time is earlier than the previous end_time, it belongs to the same session; otherwise a new session begins (see the sketch after this list).
  2. Within each session, find the minimum start_time and the maximum end_time.
  3. For each user and session, calculate the duration.
  4. Aggregate by user.
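A small run-anywhere sketch of steps 1 and 2 (three synthetic rows, not from the sample data; for simplicity it uses the running max(end_time) from the latest query above): two overlapping rows are merged into one session, and a later disjoint row opens a new one.

| makeresults 
| eval _raw="user,start_time,end_time
A,17/01/2020 14:21,17/01/2020 17:19
A,17/01/2020 14:33,17/01/2020 14:34
A,17/01/2020 18:00,17/01/2020 18:30" 
| multikv forceheader=1 
| table user,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) ] 
| eval end_time=end_time+60 
| sort user start_time 
| streamstats current=f max(end_time) as t_end by user 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user 
| stats min(start_time) as start_time max(end_time) as end_time by user session 
| eval duration = tostring(end_time - start_time,"duration")

Session 0 covers 14:21 to 17:20 (2:59:00) and session 1 covers 18:00 to 18:31 (0:31:00).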

Hi @gcusello

It was a very rewarding issue.
How about this?

0 Karma

gcusello
SplunkTrust

Hi @to4kawa,
Thank you for your help. I think your approach is correct, but this isn't the final solution yet!
Replacing the last row of the search with

 | bin span=1h start_time
 | chart sum(duration) as duration OVER start_time BY user
 | eval start_time=strftime(start_time,"%Y-%m-%d %H:%M")

I see that sometimes (user B at 2020-01-17 14:00) there are more than 3600 seconds in an hour and this is impossible.

Can you give more ideas? Anyway, I'll continue to explore your solution.

Ciao and thank you.
Giuseppe

0 Karma

to4kawa
Ultra Champion

hi, @gcusello
my query is made to aggregate over the whole search range, not per hour,
so it does not work well for that.

To aggregate per dest_ip,
we should modify | eval change_flag= if(t_end <= start_time,"over",NULL)
OR | streamstats window=1 current=f list(start_time) as p_start list(end_time) as t_end by user dest_ip
I can't try this right now, sorry.
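A minimal sketch of that second option (untested, and assuming the rest of the previous version stays unchanged):

....
| table user,dest_ip,bytes,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M")) 
    | eval end_time=end_time+60] 
| streamstats window=1 current=f list(start_time) as p_start list(end_time) as t_end by user dest_ip 
| eval change_flag= if(t_end <= start_time,"over",NULL) 
| streamstats count(eval(change_flag="over")) as session by user dest_ip 
| stats min(start_time) as start_time max(end_time) as end_time by user dest_ip session 
| eval duration = end_time - start_time 
| stats sum(duration) as duration by user dest_ip

The later accepted query goes further by keeping a running max(end_time) per user:dest_ip instead of only the previous row.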

In my query, the duration for user B runs from 17/01/2020 14:21 to 17/01/2020 17:19,
and many accesses are included in it.

I'm not sure about the per-hour breakdown.
In this case,
14:00~ 15:00 39min
15:00~ 16:00 60min
16:00~ 17:00 60min
17:00~ 18:00 19min
Is this right?

0 Karma

gcusello
SplunkTrust

Hi @to4kawa,
the per-hour check is important to verify that the solution is correct, because I cannot have more than 60 minutes in an hour.
Anyway, the last breakdown is right:
14:00~ 15:00 39min
15:00~ 16:00 60min
16:00~ 17:00 60min
17:00~ 18:00 19min
I tried to implement it on top of your code but I haven't succeeded yet.

Thank you.
Giuseppe

0 Karma

gcusello
SplunkTrust

Hi @to4kawa,
Your search is almost OK; I have only one issue: I need to split the results by day.
There can be connections that start on one day and end on the following one, and I need to divide their durations by day.
This is a subset of data to test this:

bytes       dest_ip        end_time            start_time          user
35980247    10.10.10.10    25/01/2020 01:18    24/01/2020 14:02    A
3797665     10.10.10.10    24/01/2020 15:19    24/01/2020 14:02    A
552         10.10.10.10    24/01/2020 14:15    24/01/2020 14:14    A
3200        10.10.10.10    24/01/2020 14:02    24/01/2020 14:01    A
11373297    10.10.10.10    24/01/2020 00:26    23/01/2020 13:57    A
73590782    10.10.10.10    23/01/2020 23:31    23/01/2020 09:59    A
1203724     10.10.10.10    23/01/2020 14:26    23/01/2020 13:57    A
3134        10.10.10.10    23/01/2020 13:57    23/01/2020 13:57    A
8302783     10.10.10.10    23/01/2020 13:31    23/01/2020 09:59    A
276         10.10.10.10    23/01/2020 10:20    23/01/2020 10:20    A
3104        10.10.10.10    23/01/2020 09:59    23/01/2020 09:58    A
3054        10.10.10.10    23/01/2020 09:59    23/01/2020 09:59    A

This is the version of your search that I used:

 my_search
| foreach *_time [ eval <<FIELD>> = round(strptime('<<FIELD>>',"%d/%m/%Y %H:%M"))
| eval b_<<MATCHSTR>> = strftime('<<FIELD>>',"%d/%m %H:%M")]
| table user dest_ip bytes b_start start_time b_end end_time
| streamstats min(start_time) as start_p max(end_time) as end_p by user
| streamstats min(eval(min(end_p,start_time))) as start_p  max(eval(max(end_p,end_time))) as end_p by user dest_ip
| table user dest_ip bytes b_start start_time start_p b_end end_time end_p
| eval start_check =nullif(start_time >= start_p AND end_time >= end_p,null())
| streamstats dc(eval(mvzip(start_p,end_p))) as flag
| dedup flag
| eval date=strftime(start_time,"%d")
| streamstats min(start_time) as start_p max(end_time) as end_p current=f by user dest_ip date
| reverse
| dedup start_time user
| reverse
| eval duration=end_time - start_time
| stats sum(duration) as duration max(date) as date by user dest_ip
| convert rmunit(duration) as diff
| eval diff=round(diff)
| eval difference=tostring(diff,"duration")
| eval difference = replace(difference,"(\d+)\:(\d+)\:(\d+)","\1h \2min \3s")
| table user dest_ip difference date

Ciao and Thank you.
Giuseppe

0 Karma

arjunpkishore5
Motivator

Hi @gcusello

Give this a shot, slight tweaking might be needed

base_search is your search for the events, with the times converted to epoch.
This query computes duration per user; if you need it per user and per dest_ip, add dest_ip to all the group-by and sort clauses.
What I'm doing here is basically removing overlapping timelines and grouping them into one continuous session.

base_search
| fields user, dest_ip, start_time, end_time
| sort user, dest_ip, -start_time
| streamstats max(start_time) as next_start by user, dest_ip window=1 reset_on_change=true current=false
| sort user, dest_ip, start_time
| eval next_start=coalesce(next_start, start_time), row_group=1
| where end_time>next_start
| streamstats sum(row_group) as row_group by user, dest_ip reset_on_change=true 
| eval row_group=(floor(coalesce(row_group,0)/2))
| stats min(start_time) as start_time, max(end_time) as end_time by user, dest_ip, row_group
| eval duration=end_time-start_time
| rename start_time as _time
| bucket span=1h
| stats sum(duration) as duration by user, dest_ip, _time
| eval over_time=if(duration>3600, duration-3600, 0)
| streamstats max(over_time) as add_time by user, dest_ip window=1 reset_on_change=true current=false
| fillnull value=0 add_time
| eval duration=duration + add_time - over_time

Hope this helps

Cheers

gcusello
SplunkTrust

Hi @arjunpkishore5,
Thank you for your help. I think your approach is correct, but this isn't the final solution yet!
Replacing the last row of the search with

  | bin span=1h start_time
  | chart sum(duration) as duration OVER start_time BY user
  | eval start_time=strftime(start_time,"%Y-%m-%d %H:%M")

I see that sometimes (user B at 2020-01-17 14:00) there are more than 3600 seconds in an hour and this is impossible.

Can you give more ideas? Anyway, I'll continue to explore your solution.

Ciao and thank you.
Giuseppe

0 Karma

arjunpkishore5
Motivator

I edited my answer to accommodate this condition (lines 13 onward): I'm calculating over_time, which is the duration in excess of the hour, and adding it to the next hour's total. I haven't tested the query, just going with the algorithm in my head and converting it to code, so some fixes may be needed. 🙂

Cheers

0 Karma

arjunpkishore5
Motivator

I also added dest_ip to the group by 🙂

0 Karma

to4kawa
Ultra Champion
| makeresults
| eval _raw="user,dest_ip,start_time,end_time
A,X.X.X.X,2020/02/13 10:05,2020/02/13 10:10
A,X.X.X.X,2020/02/13 10:45,2020/02/13 10:50"
| multikv forceheader=1
| table user,dest_ip,start_time,end_time
| foreach *_time [ eval <<FIELD>> = strptime('<<FIELD>>', "%Y/%m/%d %H:%M")]
| fieldformat start_time = strftime(start_time, "%Y/%m/%d %H:%M") 
| fieldformat end_time = strftime(end_time, "%Y/%m/%d %H:%M") 
| addinfo
| eval duration = coalesce(end_time,info_max_time) - coalesce(start_time,info_min_time)
| eventstats sum(duration) as total_duration by user
| table user,dest_ip,start_time,end_time duration total_duration

hi, @gcusello
Anyway, I made a query.
I considered the case of a session starting or ending outside the range of the time picker.

0 Karma

gcusello
SplunkTrust

Hi @to4kawa,
Thank you for your help.
The problem isn't finding the duration of each transaction, because each row already has start_time and end_time. The problem is that I usually have several parallel sessions, so if I simply sum all the durations I get more than the elapsed time period; what I need instead is how long the user actually used the connection.
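Just to make it concrete, a tiny synthetic example (not my real data): two 60-minute sessions that overlap by 30 minutes naively sum to two hours, while the connection was really in use for only 90 minutes.

| makeresults 
| eval _raw="user,start_time,end_time
B,17/01/2020 14:00,17/01/2020 15:00
B,17/01/2020 14:30,17/01/2020 15:30" 
| multikv forceheader=1 
| table user,start_time,end_time 
| foreach *_time 
    [ eval <<FIELD>> = strptime('<<FIELD>>',"%d/%m/%Y %H:%M") ] 
| eval duration = end_time - start_time 
| stats sum(duration) as naive_sum by user 
| eval naive_sum = tostring(naive_sum,"duration")

naive_sum comes out as 02:00:00, but the value I actually need is 01:30:00 (14:00 to 15:30).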

Below you can find an example of my data:

user dest_ip    bytes   start_time  end_time
A   10.10.10.1  330 16/01/2020 09:30    16/01/2020 09:31
B   10.10.10.2  4859519 17/01/2020 14:21    17/01/2020 17:19
B   10.10.10.2  3370174 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  238523  17/01/2020 14:21    17/01/2020 16:19
B   10.10.10.2  3166705 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  2783036 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  3174766 17/01/2020 14:21    17/01/2020 15:39
B   10.10.10.2  552 17/01/2020 14:33    17/01/2020 14:34
B   10.10.10.2  24396   17/01/2020 15:56    17/01/2020 15:59
B   10.10.10.2  675660  17/01/2020 15:57    17/01/2020 16:33
B   10.10.10.2  162019  17/01/2020 15:57    17/01/2020 16:05
B   10.10.10.2  6388    17/01/2020 15:57    17/01/2020 15:59
B   10.10.10.2  398887  17/01/2020 16:00    17/01/2020 16:13
B   10.10.10.2  294 17/01/2020 16:00    17/01/2020 16:00
B   10.10.10.2  35324   17/01/2020 16:01    17/01/2020 16:04
B   10.10.10.2  294 17/01/2020 16:06    17/01/2020 16:07
B   10.10.10.2  181888  17/01/2020 16:11    17/01/2020 17:07
B   10.10.10.2  25668   17/01/2020 16:11    17/01/2020 16:14
B   10.10.10.2  517341  17/01/2020 16:19    17/01/2020 16:33
B   10.10.10.2  518877  17/01/2020 16:19    17/01/2020 16:33
B   10.10.10.2  80488   17/01/2020 16:27    17/01/2020 16:33
B   10.10.10.2  294 17/01/2020 16:27    17/01/2020 16:28
B   10.10.10.2  294 17/01/2020 16:27    17/01/2020 16:28
B   10.10.10.2  190050  17/01/2020 16:28    17/01/2020 16:33
B   10.10.10.2  39429   17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  158281  17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  39731   17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  40338   17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  100225  17/01/2020 16:36    17/01/2020 16:39
B   10.10.10.2  23526   17/01/2020 16:39    17/01/2020 16:42
B   10.10.10.2  28910   17/01/2020 16:39    17/01/2020 16:42
B   10.10.10.2  40860   17/01/2020 16:39    17/01/2020 16:43
B   10.10.10.2  14456   17/01/2020 16:39    17/01/2020 16:42
B   10.10.10.2  15816   17/01/2020 16:39    17/01/2020 16:42
C   10.10.10.2  6354    24/01/2020 13:36    24/01/2020 13:36
C   10.10.10.2  552 24/01/2020 13:55    24/01/2020 13:55
C   10.10.10.2  82751   27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  254943  27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  56569   27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  104603  27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  88522   27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  168563  27/01/2020 09:06    27/01/2020 09:09
C   10.10.10.2  11535   27/01/2020 09:07    27/01/2020 09:09
C   10.10.10.2  8114    27/01/2020 09:07    27/01/2020 09:09
C   10.10.10.2  4691    27/01/2020 09:07    27/01/2020 09:09
C   10.10.10.2  25653   27/01/2020 09:07    27/01/2020 09:21
C   10.10.10.2  66339   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  52219   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  26527   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  7349    27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  19109   27/01/2020 09:09    27/01/2020 09:11
C   10.10.10.2  20854   27/01/2020 09:09    27/01/2020 09:11
D   10.10.10.3  46783593    20/01/2020 15:27    20/01/2020 15:52
D   10.10.10.4  49106343    20/01/2020 15:27    20/01/2020 16:23
D   10.10.10.5  1033049 20/01/2020 15:38    20/01/2020 16:23
D   10.10.10.6  287151  20/01/2020 16:23    20/01/2020 16:24
D   10.10.10.4  132 20/01/2020 16:24    20/01/2020 16:24
D   10.10.10.4  62  20/01/2020 16:24    20/01/2020 16:24
D   10.10.10.5  4405811 20/01/2020 16:24    20/01/2020 16:59
D   10.10.10.4  18716525    20/01/2020 16:24    20/01/2020 16:44
D   10.10.10.7  40444   21/01/2020 09:21    21/01/2020 09:22
D   10.10.10.7  173988  21/01/2020 09:22    21/01/2020 09:24
D   10.10.10.7  2755    21/01/2020 09:24    21/01/2020 09:24
D   10.10.10.3  1148    21/01/2020 09:45    21/01/2020 09:46
D   10.10.10.3  7131    21/01/2020 09:46    21/01/2020 09:54
D   10.10.10.3  35138614    21/01/2020 09:54    21/01/2020 10:47

Ciao.
Giuseppe

0 Karma

to4kawa
Ultra Champion
 C    10.10.10.2    6354    24/01/2020 13:36    24/01/2020 13:36

Does this row have seconds?

0 Karma

gcusello
SplunkTrust

No, the time format is %d/%m/%Y %H:%M.

Ciao.
Giuseppe

0 Karma