Think of this as a youtube scenario where I have 2 different indexes: viewerreport and videoreport. The viewerreport contains a column videoId, which I could use this Id to search for an event/video from videoreport index. A video could be watch by any user 1, 2, 100 times vary. The videoreport contains a column duration. How can I calculate the total duration of all the videos that users viewed?
Is this possible in splunk?
I have been trying this for 2 days and still can not come up with an answer. I can't think of a better way of doing this than for loop but so far I tried inputlookup, which doesn't actually fit the case. Thanks for the help!
The best thing to do is to put the video details into a lookup file with a true-up scheduled search like this:
index=videoreport
| dedup videoid
| table duration videoid
| appendpipe [|inputlookup VideosAndDurations.csv]
| dedup videoid
| outputlookup VideosAndDurations.csv
Then use it to lookup in the other search like this:
index = viewerreport
| stats count BY videoid user
| lookup VideosAndDurations videoid OUTPUT duration
| eval video_total = duration * count
| stats sum(video_total) AS grand_total BY user
Like this:
|mulisearch
[ search index=videoreport
| dedup videoid
| table duration videoid ]
[ search index=viewerreport
| stats count BY videoid user
| eval user_count = user . "=" . count
| table user_count videoid ]
| stats values(*) AS * BY videoid
| mvexpand user_count
| rex field=user_count "^(?<user>[^=]+)=(?<count>[^=]+)$"
| eval video_total = duration * count
| stats sum(video_total) AS grand_total BY user
Hi tamduong16,
if you want so sum durations of all IDs (field named "Id") in videoreport index it's easy:
index=videoreport
| stats sum(duration) AS Total_Duration BY Id
if instad you want to sum only IDs that are both in viewerreport and videoreport indexes, you could run this search
( I define that Id is named "Id" In videoreport and "videoId" in viewerreport, otherwise you have to rename fields in the correct way)
index=videoreport [ search index=viewerreport | rename videoId AS Id | dedup Id | fields Id ]
| stats sum(duration) AS Total_Duration BY Id
In this way, you use the subsearch (viewerreport index) to filter the main search (videoreport index) and then you can sum durations.
Bye.
Giuseppe