I'm trying to add a field to my main search based on the values retrieved from a subsearch. More specifically, my main search finds all questions posted on my company's community website, and the subsearch finds all responses to questions. I'm trying to match each question (found from the main search) with its first response (found by sorting the responses by time and dedup'ing the subsearch based on a thread ID). However, this doesn't seem to be working properly. Here is my search:
... | join thread_object_id [search ... | dedup thread_object_id sortby +activity_ts | eval first_comment_ts=activity_ts | fields thread_object_id first_comment_ts]
When I run this search, it pairs events from my main search with timestamps for the wrong comments (the first_comment_ts is incorrect), as if the dedup thread_object_id sortby +activity_ts
didn't sort it properly. In fact, it matches each thread_object_id
with the first_comment_ts
of the 2nd comment instead of the 1st.
When I run the subsearch on its own, everything looks fine. It only displays events that correspond to the first comment of each thread (first appearance of each thread_object_id
when sorted by activity_ts
, which is exactly what I want.
Here's an example event for a bit more clarification.
Subsearch result:
Main search result (after being joined with the subsearch):
Any ideas why the join
command would be causing these inconsistencies? Or is there any way I could do this more efficiently, perhaps without using join
? I'm sort of at a loss here.
Thanks.
Consider using stats
to merge your two sourcetypes.
You could split your single field into two fields like this:
... | eval question_ts = case(searchmatch("question"), activity_ts) | eval comment_ts = case(searchmatch("comment"), activity_ts) | ...
That's no problem. Something along these lines should do:
sourcetype=foo (question OR comment) | eventstats earliest(activity_ts) as first_comment_ts by thread_object_id | search question
The initial search
picks out both types of events, questions and comments. The eventstats
copies the earliest occurrence of an activity_ts
over to all events for each thread_object_id
(note: I've assumed that questions don't have this field. If they do you need to do a bit of reshuffling to select the correct events.). The final search
throws out the comments from the final results.
Hey, that looks like it could work. However, both questions and responses have the activity_ts field. How would you suggest I shuffle things around to account for this? Just for some clarification, I want to find the earliest activity_ts out of the response events, and use those results populate the first_comment_ts field in the question events. I've tried to tinker with it but my solutions just seem overly convoluted.
Sorry for the late response. Thank you so much for your help so far.
The events in both the main search and the subsearch have the same sourcetype.
Hello! add the type=inner
to the join command. Some thing like this:
... | join type= inner thread_object_id [search ... | dedup thread_object_id sortby +activity_ts | eval first_comment_ts=activity_ts | fields thread_object_id first_comment_ts]
Thanks
Yeah, I still get the same results.
Specifying type=inner
won't change anything because that's the default setting.