Splunk Search

Adding a Field from a Subsearch Using Join - Inaccurate Results

rescobar713
Path Finder

I'm trying to add a field to my main search based on the values retrieved from a subsearch. More specifically, my main search finds all questions posted on my company's community website, and the subsearch finds all responses to questions. I'm trying to match each question (found from the main search) with its first response (found by sorting the responses by time and dedup'ing the subsearch based on a thread ID). However, this doesn't seem to be working properly. Here is my search:

... | join thread_object_id [search ... | dedup thread_object_id sortby +activity_ts | eval first_comment_ts=activity_ts | fields thread_object_id first_comment_ts] 

When I run this search, it pairs events from my main search with timestamps for the wrong comments (the first_comment_ts is incorrect), as if the dedup thread_object_id sortby +activity_ts didn't sort it properly. In fact, it matches each thread_object_id with the first_comment_ts of the 2nd comment instead of the 1st.

When I run the subsearch on its own, everything looks fine. It only displays events that correspond to the first comment of each thread (first appearance of each thread_object_id when sorted by activity_ts, which is exactly what I want.

Here's an example event for a bit more clarification.

Subsearch result:
alt text
Main search result (after being joined with the subsearch):
alt text

Any ideas why the join command would be causing these inconsistencies? Or is there any way I could do this more efficiently, perhaps without using join? I'm sort of at a loss here.

Thanks.

Tags (3)

martin_mueller
SplunkTrust
SplunkTrust

martin_mueller
SplunkTrust
SplunkTrust

You could split your single field into two fields like this:

... | eval question_ts = case(searchmatch("question"), activity_ts) | eval comment_ts = case(searchmatch("comment"), activity_ts) | ...
0 Karma

martin_mueller
SplunkTrust
SplunkTrust

That's no problem. Something along these lines should do:

sourcetype=foo (question OR comment) | eventstats earliest(activity_ts) as first_comment_ts by thread_object_id | search question

The initial search picks out both types of events, questions and comments. The eventstats copies the earliest occurrence of an activity_ts over to all events for each thread_object_id (note: I've assumed that questions don't have this field. If they do you need to do a bit of reshuffling to select the correct events.). The final search throws out the comments from the final results.

0 Karma

rescobar713
Path Finder

Hey, that looks like it could work. However, both questions and responses have the activity_ts field. How would you suggest I shuffle things around to account for this? Just for some clarification, I want to find the earliest activity_ts out of the response events, and use those results populate the first_comment_ts field in the question events. I've tried to tinker with it but my solutions just seem overly convoluted.

Sorry for the late response. Thank you so much for your help so far.

0 Karma

rescobar713
Path Finder

The events in both the main search and the subsearch have the same sourcetype.

0 Karma

stephanefotso
Motivator

Hello! add the type=inner to the join command. Some thing like this:

 ... | join type= inner thread_object_id [search ... | dedup thread_object_id sortby +activity_ts | eval first_comment_ts=activity_ts | fields thread_object_id first_comment_ts] 

Thanks

SGF
0 Karma

rescobar713
Path Finder

Yeah, I still get the same results.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Specifying type=inner won't change anything because that's the default setting.

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...