The below table is for one User, like wise I have to pull the details for many users - who visited multiple url on different timestamp, I am trying to calculate the total duration between each url/E to url/J.
So what I am trying to achieve is whenever the user is visiting url/E and traversing till url/J - calculate the total duration. I trying using transaction command but it only calculates the last event of url/E and url/J
USER_ID. TIMESTAMP. URL
CD_125 | 05:30:36 | URL/E |
CD_!25 | 05:30:38 | URL/F |
CD_125 | 05:30:39 | URL/H |
CD_125 | 05:30:41 | URL/J |
CD_125 | 05:30:43 | URL/E |
CD_125 | 05:30:44 | URL/I |
CD_125 | 05:30:45 | URL/J |
what I am looking here is duration for each URL/E to URL/J . The output what I am expecting is this.
User_ID Duration URL
CD_125 | 5 | url/E |
CD_125 | 2 | url/E |
would appreciate if someone could guide and help me with the query. thanks
| sort 0 _time
| streamstats count(eval(url="URL/E")) as eCount global=f by user
| reverse
| streamstats count(eval(url="URL/J")) as jCount global=f by user eCount
| eventstats max(jCount) as jMax by user eCount
| eval jeCount=if(jMax>0 AND eCount>0,jMax-jCount+1,0)
| reverse
| where jeCount=1
| sort 0 user _time
| stats range(_time) as duration list(url) as url by user eCount
@ITWhisperer I was trying to pull the details for last 15mins for which it displays only 7 events and when I did a normal stats count it shows more than 100+.
Not sure if this query can be tweaked further.
if you can help me why the "reverse" command was used twice. Thanks
I am not sure what the issue is here - can you share the events or at least an example of the events you think are not being counted when they should be?
Splunk SPL essentially works on a stream or pipeline of events. The streamstats command builds aggregations as it processes the pipeline. The first sort puts the events into ascending time order i.e. earliest first, since this seems to be important for you use case.
It can then find the URL/E.
The first reverse changes the order of the events in the pipeline (similar to sort 0 -_time but quicker!) so that the URL/J can be found
The second reverse restores the order of events in the pipeline to time order. To be fair, this is probably redundant since order is restored by the second sort.
@ITWhisperer thanks
Thank you so much @ITWhisperer , it worked perfectly.
Much Appreciated.
| sort 0 _time
| streamstats count(eval(url="URL/E")) as eCount global=f by user
| reverse
| streamstats count(eval(url="URL/J")) as jCount global=f by user eCount
| eventstats max(jCount) as jMax by user eCount
| eval jeCount=if(jMax>0 AND eCount>0,jMax-jCount+1,0)
| reverse
| where jeCount=1
| sort 0 user _time
| stats range(_time) as duration list(url) as url by user eCount
@ITWhisperer
why do we have a condition like
| where jeCount = 1
is it only for when url/E is just once for each transaction
If URL/J occurs for the user again before the next URL/E, jeCount can become 2 etc. so the condition avoids this situation.