Okay, I'm new to Splunk -- I'm currently two days deep. I'm attempting to sort users by their duration (duration being the length of time they've spent watching any one video). When I type in: sourcetype=videos | table user duration | sort user duration | reverse
, I end up with the same user all of the way down the column on the left associated with the viewing durations on the right. I want to know if there is a way to create a running total of these durations to post a total time watched per user.
Example output:
user ------ duration
tom.jones 00:51:13
tom.jones 00:31:03
tom.jones 00:15:02
tom.jones 00:08:11
tom.jones 00:02:21
steve.corel 00:41:16
steve.corel 00:30:33
steve.corel 00:22:46
etc.
Thank you in advance.
Normally, one would use the stats command to sum them, except stats only works with numbers and duration is not a number (because of the ':'). A workaround is to convert duration into integer seconds before the stats command and then convert it back before the table command. Something like this:
sourcetype=videos | eval secs=strptime(duration, "%H:%M:%S")-strptime("00:00:00", "%H:%M:%S") | stats sum(secs) as totalSecs by user | sort user totalSecs desc | eval totalDuration=strftime(totalSecs, "%H:%M:%S") | table user totalDuration
First, you'll need a field to hold your times as "timestamps" (integers) so they can be added:
sourcetype=videos | table user duration | sort user duration | eval intDuration=strptime(duration,"%H:%M:%S")
Then, you'll want to use streamstats
to generate the running total for each user:
sourcetype=videos | table user duration | sort user duration | eval intDuration=strptime(duration,"%H:%M:%S") | streamstats sum(intDuration) as TotalDuration by user
Finally, if you want, you can convert the time variable back:
sourcetype=videos | eval intDuration=strptime(duration,"%H:%M:%S") | streamstats sum(intDuration) as TotalDuration by user | eval TotalDurationTime=strftime(TotalDuration,"%H:%M:%S")
Normally, one would use the stats command to sum them, except stats only works with numbers and duration is not a number (because of the ':'). A workaround is to convert duration into integer seconds before the stats command and then convert it back before the table command. Something like this:
sourcetype=videos | eval secs=strptime(duration, "%H:%M:%S")-strptime("00:00:00", "%H:%M:%S") | stats sum(secs) as totalSecs by user | sort user totalSecs desc | eval totalDuration=strftime(totalSecs, "%H:%M:%S") | table user totalDuration
This is exactly what I'm looking for. Thank you! I only have one issue. For some reason my top 5 users are displaying the time of 23:59:59. Why would this be? I calculated the CSV times by user in a spreadsheet, the others are correct.
e.g. user5 should be displaying a time of 8:53:42..
user1 23:59:59
user2 23:59:59
user3 23:59:59
user4 23:59:59
user5 23:59:59
user6 01:47:05
user7 00:50:45
user8 00:40:55
user9 00:21:56
What are the durations for user1?
I narrowed it down. The seconds print out correctly, but when "eval totalDuration=strftime(totalSecs, "%H:%M:%S")" is added the totals do not come out correctly for the top 5.
user totalSecs totalDuration
user 1 148603874617.000000 23:59:59
user 2 84304119903.000000 23:59:59
user 3 67157542422.000000 23:59:59
user 4 51439802027.000000 23:59:59
user 5 35722091866.000000 23:59:59
user 6 32864320734.000000 23:59:59
user 7 21433250455.000000 00:40:55
user 8 12859955225.000000 01:47:05
user 9 12859951693.000000 00:48:13
user 10 10002183716.000000 00:21:56
user11 2857767945.000000 00:25:45
these are the durations for user1:
Duration
0:02:53
0:09:59
0:00:18
0:02:58
0:00:53
0:06:06
0:02:40
0:14:40
0:00:01
0:06:17
0:01:38
0:02:59
0:05:01
0:02:12
0:08:45
0:01:46
0:01:55
0:05:54
0:00:06
0:00:00
0:00:04
0:02:14
0:00:48
0:01:08
0:04:06
0:04:25
0:00:27
0:05:51
0:04:51
0:00:34
0:00:30
0:02:28
0:03:35
0:00:23
0:05:20
0:00:02
More than a few rows results in a large number that doesn't render properly. I've updated my answer.