Splunk Search

can the join command only give me inner joins and left outer joins?

sideview
SplunkTrust
SplunkTrust

I actually need a right join in some cases.

I know im not supposed to use joins at all, and wherever possible use a disjunction plus stats, or use a lookup, because these are faster, better, cheaper, awesomer etc.
(http://answers.splunk.com/questions/822/simulating-a-sql-join-in-splunk/1717#1717)

However sometimes there's just no other way. One side of my data comes from a search and the other side comes from inputlookup. So i cant just glue together two sets of 'events' in a single search with some OR's and stitch them back together with stats count by foo later. .

Anyway, proceeding with join, I have 2 searches that return events with a field called sourceHost

search <somewhat expensive search>
inputlookup foo.csv

Following best practices with join, the cheaper smaller search goes inside the brackets:

search <somewhat expensive search> | join common_field [inputlookup foo.csv]

"inner" is the default type, so if rows are in left side and not in the right side, or in the right side and not on the left side, they'll be dropped.

the docs say that type="outer" and type="left" are synonymous. http://www.splunk.com/base/Documentation/latest/SearchReference/Join

and as far as I can tell there's no type="right". Is there another way?

Tags (2)
0 Karma
1 Solution

sideview
SplunkTrust
SplunkTrust

One workaround is to use the append command and then much like the disjunction trick (with the OR's) you glue the two sets back together with a stats. Say you have a field called 'lastConnected' in your lookup table, and some stats coming back in your outer search like sum(kb) you can do this.

 search foo | stats sum(kb) as KB by sourceHost | append [inputlookup foo.csv] | stats first(KB) as KB first(lastConnected) as lastConnected by sourceHost

View solution in original post

gkanapathy
Splunk Employee
Splunk Employee

Yeah. technically our "outer" is "left outer", not "right outer". There is also no "full".

0 Karma

sideview
SplunkTrust
SplunkTrust

One workaround is to use the append command and then much like the disjunction trick (with the OR's) you glue the two sets back together with a stats. Say you have a field called 'lastConnected' in your lookup table, and some stats coming back in your outer search like sum(kb) you can do this.

 search foo | stats sum(kb) as KB by sourceHost | append [inputlookup foo.csv] | stats first(KB) as KB first(lastConnected) as lastConnected by sourceHost

sideview
SplunkTrust
SplunkTrust

oh nice! yea that's much better. Were you to write it up as an answer I'd probably pick it and delete/merge my own. 😃

0 Karma

gkanapathy
Splunk Employee
Splunk Employee

You can also avoid using a subsearch entirely in this case (along with it's attendant size and time limits etc) by using the append=t option on | inputlookup instead: search foo | stats sum(kb) as KB by sourceHost | inputlookup append=t foo.csv | stats first(KB) lastConnected by sourceHost

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...