Hi All,
I'm trying to optimize the following search because it runs very slow. Looking for some help w/it. I've been exploring the multi-search command also, but cannot figure out how to get it working yet...
| union
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=ShowDeviceProgramOffer AND completion_code=0)
| join type=inner session
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=ShowBackupDeviceProgramOffer AND completed_from_ui=False AND completion_code=0)]
| join type=inner session
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=SetUserOnboardingComplete AND completion_code=0)]
| timechart span=5m dc(session) as total1]
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=ShowDeviceProgramOffer AND completion_code=0)
| join type=inner session
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=ShowBackupDeviceProgramOffer AND completed_from_ui=False AND completion_code=0)]
| join type=inner session
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=SetUserOnboardingComplete AND completion_code=0)]
| join type=inner session
[search index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset" (flow_name=UcdeDeviceOnboarding AND step_name=ShowUcdeActivationStatus AND completion_code=0)]
| timechart span=5m dc(session) as total2]
| timechart span=5m sum(total1) as eval1, sum(total2) as eval2
| eval ActivationFailed=eval1-eval2
| timechart span=5m sum(eval2) as "Accepted & SignedIn", sum(ActivationFailed) as "Activation Failed" partial=f
without knowing your data, it's a little tricky to give an exact solution, but values(*) as * will not include _time, so you can also do values(_time) as _time in the stats to retain _time.
To get rid of all those horrible subsearches, try this approach using a single search + stats to do the 'join' and then you can use eval to determine which category of your union it belongs to
Not totally sure what your end goal is, but this should demonstrate how to collapse many subsearches into a single search.
index=xmo2-aws host=xmo2-prod* sourcetype=oss_app_log "Step completion_code set/reset"
``` Combine all the searcg conditions into a single search command, looking for all permutations required ```
(flow_name=UcdeDeviceOnboarding AND
(step_name=ShowDeviceProgramOffer AND completion_code=0) OR
(step_name=ShowBackupDeviceProgramOffer AND completed_from_ui=False AND completion_code=0) OR
(step_name=SetUserOnboardingComplete AND completion_code=0) OR
(step_name=ShowUcdeActivationStatus AND completion_code=0)
)
``` Now aggregate all fields by the session and filter out those with less than 3 step names ```
| stats values(*) as * values(step_name) as step_names by session
| eval step_count=mvcount(step_names)
| where step_count < 3
``` Now if we have all 4 step names, then we are the total2 type ```
| eval session_type=if(step_count=4, 2, 1)
``` finally timechart by the session type above ```
| timechart span=5m dc(session) as sessions by session_type
``` rest of your search as needed ```
As @starcher indicates, using joins is likely to take you into a world of pain, where you don't realise you are missing data until someone tells you your numbers are wrong
this doesn't quite work because the conditions are mutually inclusive. The session has to meet all 3 of the first conditions. Then, if it meets the 4th condition, it is a success. If not, it is a failure.
I should have said, if it meets the 4th criteria as well as all 3 of the first criteria, then it is a success.
If I understand correctly, then all sessions containing all 4 steps should be included in the total1 count of sessions with 3 steps, so given the sessions
A=3, B=3, C=4, D=4, E=3
dc(session) for session_type 2 would = 2 (t2) and dc(session) for session_type 1 would be 3 (t1), so the effective dc(session) for any with 3 steps would be t1+t2
so after the timechart, do
| eval session_type=if(step_count=4, "t2", "t1")
| timechart span=5m dc(session) as sessions by session_type
| eval t1=t1+t2
does that work?
If not, can you give an example, so I can get my head around it better
still, the timechart command doesn't work without _time field, correct?
without knowing your data, it's a little tricky to give an exact solution, but values(*) as * will not include _time, so you can also do values(_time) as _time in the stats to retain _time.
I was able to get that to work. Thanks for the tips!
Great! Hope it actually made a performance difference 😁
Also, timechart doesn't work in your example because there is no _time field...
I would start here for reading. Use the guidance to gradually rebuild from scratch. Joins and sub searches introduce all sorts of limits. Even if yours ran faster it will likely miss things. https://conf.splunk.com/files/2020/slides/TRU1761C.pdf