Splunk Join command basics / newbie examples

inventsekar
Ultra Champion

Hi All...

For those who already know some SQL, the join command is fairly easy. Some of my teammates who don't know SQL were not aware of join, and when they tried to read the docs they could not follow them easily. Hence I thought I'd create this post for everyone. Thanks.

inventsekar
Ultra Champion

Let's take two simple files:

ubuntu@sekar:~$ more /tmp/names1
name=a
name=b
name=c
name=e
name=f
ubuntu@sekar:~$ more /tmp/names2
name=d
name=f
name=g
name=h
name=i
ubuntu@sekar:~$  

I uploaded these two files and used the join command:

Join

1. Inner join example (inner is the default join type):

[Screenshot: join-inner.png]
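Since the screenshot may not be visible, here is a rough sketch of what an inner join over the two files could look like. It assumes the uploads ended up with source="names1" and source="names2" and that Splunk auto-extracted the field name from the name=... lines; those source values are my assumption, so adjust them to whatever your upload produced.

```
source="names1"
| join type=inner name
    [ search source="names2" ]
| table name
```

With the two files above, only f exists on both sides, so this search should return the single row name=f. Leaving out type=inner should behave the same way, because inner is the default.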

2. Left join example:

[Screenshot: join-left.png]
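A possible left join over the same two files is sketched below. The source values names1 and names2 are assumptions about how the files were uploaded, and in_names2 is a hypothetical helper field added only to make the matched rows visible.

```
source="names1"
| join type=left name
    [ search source="names2"
      | eval in_names2="yes" ]
| table name in_names2
```

Every name from names1 (a, b, c, e, f) should be kept. Only f should carry in_names2=yes, because f is the only name that also appears in names2.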

3. Outer join example:

[Screenshot: join-outer.png]
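For comparison, a sketch of the same search with type=outer. As before, the source values names1 and names2 are assumed, and in_names2 is a hypothetical marker field.

```
source="names1"
| join type=outer name
    [ search source="names2"
      | eval in_names2="yes" ]
| table name in_names2
```

The join documentation treats outer and left as the same join type, so this should again return a, b, c, e and f, with in_names2 set only for f. Note that this is not a SQL-style full outer join: names that exist only in names2 (d, g, h, i) do not appear in the result.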

inventsekar
Ultra Champion

Accepting the above as the solution.

Please reply with your views and karma points 😉

inventsekar
Ultra Champion

Hi All, 

In Splunk, are the left join and the outer join the same?

to4kawa
Ultra Champion

Descriptions for the join-options argument

type

Syntax: type=inner | outer | left

Description: Indicates the type of join to perform. The difference between an inner and a left (or outer) join is how the events are treated in the main search that do not match any of the events in the subsearch. In both inner and left joins, events that match are joined. The results of an inner join do not include events from the main search that have no matches in the subsearch. The results of a left (or outer) join include all of the events in the main search and only those values in the subsearch that have matching field values.

Default: inner

https://docs.splunk.com/Documentation/Splunk/8.0.4/SearchReference/Join

I think both are the same.

bowesmana
SplunkTrust

It's worth pointing out in any Splunk discussion of join that there are some hidden pitfalls that can be hard to detect with large data sets, particularly around the default limits on subsearch result size and subsearch run time.

I find that SQL devs coming to Splunk will always try to skin the cat with a join and then increase limits when things don't work.

The alternative commands section at the top of the join documentation is a good starting point, and I have found it really useful to use stats as a starting point to combine multiple disparate data sets. I've generally found it faster than join, and for really large data sets join just will not work in any reasonable time frame.

That's not to say that join doesn't have a use, but it should rarely be the go-to command for a join-type operation. Working out how to do it the stats way gives you a better understanding of the data/pipeline flow in SPL.
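As a rough illustration of the stats way, using the two small files from the accepted solution: the sketch below assumes they were uploaded with source="names1" and source="names2" (assumed values, adjust to your data). The field names source_count and sources are just labels chosen for this example.

```
source="names1" OR source="names2"
| stats dc(source) AS source_count values(source) AS sources BY name
| where source_count=2
```

This searches both data sets in one pass, groups by the shared field, and keeps only names seen in both sources, which should give the same answer as the inner join (name=f). For a left-join-style result, one option is to replace the last line with | search sources="names1", which keeps every name that appears in names1 whether or not it also appears in names2. There is no subsearch involved, so the subsearch limits mentioned above do not come into play.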

 
