Hello I have a transaction query which I would like to optimize. It is impossible to run the query for a few hours. Is there anyway I can rephrase the query to have the same results?
Query: transaction Id1,Id2 startswith=login endswith=logout keepevicted=true
A unique event is mapped by combination of Id1 and Id2
I want to map all users who have logged in and logged out in the window
Also all users who have logged in but not logged out
And finally users who have logged out in the given time frame.
I am not sure how can I use subsearches or xyseries to optimize this query.
Can you paste your whole query? I strongly suggest users not to use the transaction command as it's non-streaming (i.e. forces the results to be pushed to the search head which loses the parallalism of Splunk).
index=shop sourcetype="shop1" (component=logout) OR (component=login) Id1!=NULL
|transaction Id1,Id2 startswith=login endswith=logout keepevicted=true
| eval ltime=relativetime(now(),"-15m")
| eval duration=if(match(api,"logout"),time-ltime,if(match(api,"login"),now()-max(time),duration))
| convert dur2sec(duration) AS duration
| search duration>360
|stats sum(duration) as watchtime, count(Id1) as totalIds by Id2
|eventstats sum(totalIds) as total
|eval totaltime =round(time/60,2),quoten=(totalIds/total)*100
|lookup userLookup Id2 as Id2 OUTPUT username as username
|table Id2, username,totaltime, total_Ids,quoten,average | sort -quoten
Transaction, while a powerful command, is also very resource-intensive. When possible, you should always try to use stats, eventstats or streamstats instead of transaction.
This method gets you what the transaction got you, but with much less machine cost.
index=shop sourcetype="shop1" (component=logout) OR (component=login) Id1!=NULL | fields _time Id1 Id2 api component | eval login_time=case(component="login",_time) | eval logout_time=case(component="logout",_time) | rename COMMENT as "Assign a login group number to each login for each Id combination" | sort 0 _time | streamstats count(login_time) as groupno by Id1 Id2 | rename COMMENT as "Calculate the duration of each group" | stats min(login_time) as login_time max(logout_time) as logout_time range(_time) as duration by Id1 Id2 groupno | rename COMMENT as "Set defaults for missing logins and logouts" | eval login_time=coalesce(login_time, some default start time of your choice...) | eval logout_time=coalesce(logout_time, some default end time of your choice...)
From that point, your logic should be pretty much the same.
Note - your logic shows both fields
api as being tested for values "login" and "logout". We assumed
component was correct.
I did step 14,15 (coalesce) before I did stats(step 11) so that he counts the evicted results as well. And instead of range I just use eval to substract time between logout and login. Somehow range was not able to calculate the correct duration. Thank you for your help.