Think of this as a youtube scenario where I have 2 different indexes: viewerreport and videoreport. The viewerreport contains a column videoId, which I could use this Id to search for an event/video from videoreport index. A video could be watch by any user 1, 2, 100 times vary. The videoreport contains a column duration. How can I calculate the total duration of all the videos that users viewed?
Is this possible in splunk?
I have been trying this for 2 days and still can not come up with an answer. I can't think of a better way of doing this than for loop but so far I tried inputlookup, which doesn't actually fit the case. Thanks for the help!
The best thing to do is to put the video details into a lookup file with a true-up scheduled search like this:
index=videoreport | dedup videoid | table duration videoid | appendpipe [|inputlookup VideosAndDurations.csv] | dedup videoid | outputlookup VideosAndDurations.csv
Then use it to lookup in the other search like this:
index = viewerreport | stats count BY videoid user | lookup VideosAndDurations videoid OUTPUT duration | eval video_total = duration * count | stats sum(video_total) AS grand_total BY user
|mulisearch [ search index=videoreport | dedup videoid | table duration videoid ] [ search index=viewerreport | stats count BY videoid user | eval user_count = user . "=" . count | table user_count videoid ] | stats values(*) AS * BY videoid | mvexpand user_count | rex field=user_count "^(?<user>[^=]+)=(?<count>[^=]+)$" | eval video_total = duration * count | stats sum(video_total) AS grand_total BY user
if you want so sum durations of all IDs (field named "Id") in videoreport index it's easy:
index=videoreport | stats sum(duration) AS Total_Duration BY Id
if instad you want to sum only IDs that are both in viewerreport and videoreport indexes, you could run this search
( I define that Id is named "Id" In videoreport and "videoId" in viewerreport, otherwise you have to rename fields in the correct way)
index=videoreport [ search index=viewerreport | rename videoId AS Id | dedup Id | fields Id ] | stats sum(duration) AS Total_Duration BY Id
In this way, you use the subsearch (viewerreport index) to filter the main search (videoreport index) and then you can sum durations.