Splunk Search

join with a subsearch that has a different field name, and do so efficiently

Communicator

I would like to join search results with subsearch results, but I need to rename or define a new field name in order to tie one search to the other properly. Unfortunately, I can't seem to get the subsearch to use that new variable name.

First, the main search:

foo | eval join_id=parentsessionid

This finds all the "foo" events, along with each one's parent session ID. I copy parentsessionid into a new field, "join_id", because I want to use it to join with results from the parent session. Note that both "foo" and "bar" events have sessionid and parentsessionid fields, so I have to be careful about which field I use on each side.

Now I want to join with a subsearch:

 | join join_id [ search bar | eval join_id=sessionid ]

In theory, this should join the two together: the "bar" information from the parent session with the "foo" information from the child.

In all, the search looks like this:

foo | eval join_id=parentsessionid
 | join join_id [ search bar | eval join_id=sessionid ]

Unfortunately, this does not work. The output is simply the result of a plain "foo" search, as though the "bar" search never happened.

For those who prefer a real example,

2c0657b033a076d7df0e2b7d8d4288c7 (call_start OR connectionid) 
 | eval join_id=parentsessionid
 | join join_id [ search 34ec4840b397715e47d33304ba1b9be0 (session event connection.connected) 
 | eval join_id=sessionid ]

I also realize that this seems hideously inefficient. I'm searching over a very large number of "bar" entries and then discarding almost all of them. I wouldn't mind a tip or two on how to make the search more efficient, but at present I'm more concerned about getting it to work in the first place.


Legend

Generally it is wise to avoid join if possible. It's very expensive resource-wise and there's often (though of course not always) a smoother solution that's more suited for Splunk instead of being more suited for SQL. If you can find a set of eval statements that will create a join_id that comes from the parent session ID in the cases where you want that, and the current session ID where you want that, you could use transaction instead. It's admittedly somewhat resource consuming as well, but it's smoother and often makes more sense to use.

... | eval join_id=if(someconditionforparentsessionid, parentsessionid, sessionid) | transaction join_id
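
To make that more concrete, here is a minimal sketch, assuming the literal terms "foo" and "bar" from the question are enough to tell the child events from the parent events. searchmatch("foo") is only a stand-in for the real condition; substitute whatever actually distinguishes the two kinds of events in your data, and note that an event matching both terms would be treated as a "foo" event here.

```
(foo) OR (bar)
 | eval join_id=if(searchmatch("foo"), parentsessionid, sessionid)
 | transaction join_id
```

Both sets of events come back from a single base search, so no subsearch is needed, and each resulting transaction should group a parent's "bar" events with the "foo" events of its children.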

Communicator

Still not able to come up with a join; I tried subsearch, and that's not producing the results I expect either.


Communicator

Ayn,

Thanks for the suggestion. Unfortunately, I have been unable so far to come up with a transaction for this, which was my first choice. AFAICT I need the specific session ID info from the "foo" search first.

I'll try again, however, because your answer just gave me an idea for a new "if" that I haven't tried yet.


Communicator

I should mention that the output of each individual search is, in fact, correct.

That is,

foo | eval join_id=parentsessionid | stats values(join_id)

produces the expected result.
