Splunk Search

Adding a Field from a Subsearch Using Join - Inaccurate Results

Path Finder

I'm trying to add a field to my main search based on the values retrieved from a subsearch. More specifically, my main search finds all questions posted on my company's community website, and the subsearch finds all responses to questions. I'm trying to match each question (found from the main search) with its first response (found by sorting the responses by time and dedup'ing the subsearch based on a thread ID). However, this doesn't seem to be working properly. Here is my search:

... | join thread_object_id [search ... | dedup thread_object_id sortby +activity_ts | eval first_comment_ts=activity_ts | fields thread_object_id first_comment_ts] 

When I run this search, it pairs events from my main search with timestamps for the wrong comments (the firstcommentts is incorrect), as if the dedup thread_object_id sortby +activity_ts didn't sort it properly. In fact, it matches each thread_object_id with the first_comment_ts of the 2nd comment instead of the 1st.

When I run the subsearch on its own, everything looks fine. It only displays events that correspond to the first comment of each thread (first appearance of each thread_object_id when sorted by activity_ts, which is exactly what I want.

Here's an example event for a bit more clarification.

Subsearch result:
alt text
Main search result (after being joined with the subsearch):
alt text

Any ideas why the join command would be causing these inconsistencies? Or is there any way I could do this more efficiently, perhaps without using join? I'm sort of at a loss here.

Thanks.

Tags (3)

SplunkTrust
SplunkTrust

SplunkTrust
SplunkTrust

You could split your single field into two fields like this:

... | eval question_ts = case(searchmatch("question"), activity_ts) | eval comment_ts = case(searchmatch("comment"), activity_ts) | ...
0 Karma

SplunkTrust
SplunkTrust

That's no problem. Something along these lines should do:

sourcetype=foo (question OR comment) | eventstats earliest(activity_ts) as first_comment_ts by thread_object_id | search question

The initial search picks out both types of events, questions and comments. The eventstats copies the earliest occurrence of an activity_ts over to all events for each thread_object_id (note: I've assumed that questions don't have this field. If they do you need to do a bit of reshuffling to select the correct events.). The final search throws out the comments from the final results.

0 Karma

Path Finder

Hey, that looks like it could work. However, both questions and responses have the activityts field. How would you suggest I shuffle things around to account for this? Just for some clarification, I want to find the earliest activityts out of the response events, and use those results populate the firstcommentts field in the question events. I've tried to tinker with it but my solutions just seem overly convoluted.

Sorry for the late response. Thank you so much for your help so far.

0 Karma

Path Finder

The events in both the main search and the subsearch have the same sourcetype.

0 Karma

Motivator

Hello! add the type=inner to the join command. Some thing like this:

 ... | join type= inner thread_object_id [search ... | dedup thread_object_id sortby +activity_ts | eval first_comment_ts=activity_ts | fields thread_object_id first_comment_ts] 

Thanks

0 Karma

Path Finder

Yeah, I still get the same results.

0 Karma

SplunkTrust
SplunkTrust

Specifying type=inner won't change anything because that's the default setting.

0 Karma