Well, I figured it out. The saved search I ended up with is:
| multisearch
    [ search earliest=@d index=website_summary search_name="Website Sessions" ]
    [ search earliest=@d index=apache_logs status=200 contenttype=text/html
      | eval date=strftime(_time, "%Y-%m-%d")
      | eval session_id=md5("".date.host_header.clientip.useragent) ]
| stats earliest(_time) AS _time
        earliest(referer) AS referer
        earliest(uri) AS uri
        first(search_name) AS search_name
  BY session_id host_header clientip useragent
| where isnull(search_name)
Here was my logic leading up to this result:
I knew I couldn't use a subsearch, so I decided to try multisearch. That way I wouldn't have any problem with subsearch limits, but since both result sets would be mixed together, I needed a way to filter out the duplicate session_ids.
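For contrast, the subsearch version I was avoiding would have looked something like this (same field names, shown only as an illustration). It works on small volumes, but the inner search truncates its results once it hits the default subsearch limits (10,000 rows / 60 seconds), which is exactly the problem:

earliest=@d index=apache_logs status=200 contenttype=text/html
| eval date=strftime(_time, "%Y-%m-%d")
| eval session_id=md5("".date.host_header.clientip.useragent)
| search NOT [ search earliest=@d index=website_summary search_name="Website Sessions" | fields session_id ]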
Because multisearch only accepts streaming commands, I couldn't use stats inside either of its searches, but I needed it because the session_id was derived from earliest(_time). So I had to figure out a way to generate a session_id before stats. Instead of using _time directly, I created a date field with strftime(_time, "%Y-%m-%d") and built the session_id from that. Perfect!
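As a quick sanity check that the streaming version produces a stable ID, something like this (the time range is just an example) should show every event from the same client on the same day getting the same session_id:

earliest=-1h index=apache_logs status=200 contenttype=text/html
| eval date=strftime(_time, "%Y-%m-%d")
| eval session_id=md5("".date.host_header.clientip.useragent)
| table _time date host_header clientip useragent session_id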
Now I just needed a way to remove the duplicates. By running stats across both result sets, I could key on a field that I knew existed in the summary index but not in the new results. search_name fit the bill, so I added first(search_name) to the stats and then used where isnull(search_name) to keep just the new results.
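If you want to see how many sessions that where clause is actually dropping, the same search with a different tail gives a quick breakdown (just a diagnostic sketch, not part of the saved search):

| multisearch
    [ search earliest=@d index=website_summary search_name="Website Sessions" ]
    [ search earliest=@d index=apache_logs status=200 contenttype=text/html
      | eval date=strftime(_time, "%Y-%m-%d")
      | eval session_id=md5("".date.host_header.clientip.useragent) ]
| stats first(search_name) AS search_name BY session_id host_header clientip useragent
| eval status=if(isnull(search_name), "new", "already summarized")
| stats count BY status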