I've seen it suggested before, and have definitely witnessed it myself, that for searches involving any significant amount of data it is far faster to retrieve all the data and correlate it afterward with stats than to use a subsearch in your base query. To illustrate what I mean, say you have two sourcetypes, "left" and "right", each containing its own set of data, and they share a unique identifier we'll call "unique_id" that can be used to correlate them. So why does a search like this:
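As a minimal sketch of the two patterns being compared (the sourcetype and field names are the illustrative ones from the question, not real data), the subsearch version looks something like this, with the inner search running first:

```
sourcetype=left [ search sourcetype=right | fields unique_id ]
```

while the stats version retrieves both sourcetypes in one pass and correlates afterward:

```
(sourcetype=left OR sourcetype=right)
| stats values(*) AS * count BY unique_id
| where count > 1
```

Here `where count > 1` keeps only unique_id values that appeared in both sourcetypes, mimicking the filtering the subsearch performs.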
The subsearch in your first example is fast when the subsearch returns a small number of rows (e.g., fewer than 100).
In our testing, parsing the subsearch also takes time, because Splunk must gather all of its result values before the outer search can proceed.
The subsearch is enclosed in square brackets and runs first. Look at the "expanded search" in the Job inspector to see how its results are passed to the outer search as key-value fields (you can see how complex the expanded search becomes!). This gets prohibitively time-consuming when the subsearch returns more than about 1000 rows.
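For illustration (the unique_id values here are made up), a subsearch such as `sourcetype=left [ search sourcetype=right | fields unique_id ]` is expanded into the outer search as a chain of OR'd field filters, roughly:

```
sourcetype=left ( ( unique_id="id-001" ) OR ( unique_id="id-002" ) OR ( unique_id="id-003" ) )
```

One OR clause is generated per subsearch result row, so with thousands of rows the expanded search grows enormous, which is the root of the slowdown and why the stats-based correlation scales better.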