Solved: Splunk Join command basics / newbie examples

inventsekar · ‎06-16-2020

Hi All...

For those who already know some SQL, the join command is pretty easy. Some of my teammates who have no SQL background were not aware of join, and when they tried to read the docs they could not follow them easily. Hence I thought I would create this post for everyone. Thanks.

inventsekar · ‎06-16-2020

Let's take two simple files:

ubuntu@sekar:~$ more /tmp/names1
name=a
name=b
name=c
name=e
name=f
ubuntu@sekar:~$ more /tmp/names2
name=d
name=f
name=g
name=h
name=i
ubuntu@sekar:~$

I uploaded these two files and used the join command:

1. inner join example: (inner join is the default join method):

2. left join example:

3. outer join example:
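
As a rough illustration of what the three join types return for these two files, here is a small Python sketch (not SPL). It only models the join semantics on the field name. The helper `join_names` and the extra `source2` field are made up for this illustration and are not Splunk features. It assumes the behavior described in the Splunk join documentation, where type=left and type=outer give the same result.

```python
# Values from /tmp/names1 (main search) and /tmp/names2 (subsearch).
names1 = ["a", "b", "c", "e", "f"]
names2 = ["d", "f", "g", "h", "i"]


def join_names(main, sub, join_type="inner"):
    """Model a Splunk-style join on `name` (hypothetical helper, not an API)."""
    if join_type not in ("inner", "left", "outer"):
        raise ValueError(f"unknown join type: {join_type}")
    sub_set = set(sub)
    rows = []
    for name in main:
        if name in sub_set:
            # Matching events are joined in every join type.
            rows.append({"name": name, "source2": "names2"})
        elif join_type in ("left", "outer"):
            # left/outer keep unmatched main-search events as they are.
            rows.append({"name": name, "source2": None})
        # inner drops main-search events that have no match.
    return rows


for join_type in ("inner", "left", "outer"):
    result = [row["name"] for row in join_names(names1, names2, join_type)]
    print(join_type, result)
# inner ['f']
# left ['a', 'b', 'c', 'e', 'f']
# outer ['a', 'b', 'c', 'e', 'f']
```

Note that d, g, h and i never show up: values that exist only in the subsearch are not added by any of the three types.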

inventsekar · ‎06-18-2020

Accepting the above as the solution.

Please reply with your views and karma points 😉

inventsekar · ‎07-04-2020

Hi All,

Are the Splunk left join and outer join the same thing?

to4kawa · ‎07-04-2020

Descriptions for the join-options argument

type

Syntax: type=inner | outer | left

Description: Indicates the type of join to perform. The difference between an inner and a left (or outer) join is how the events are treated in the main search that do not match any of the events in the subsearch. In both inner and left joins, events that match are joined. The results of an inner join do not include events from the main search that have no matches in the subsearch. The results of a left (or outer) join includes all of the events in the main search and only those values in the subsearch have matching field values.

Default: inner

https://docs.splunk.com/Documentation/Splunk/8.0.4/SearchReference/Join

I think both are the same.

bowesmana · ‎07-05-2020

It's worth pointing out in any Splunk discussion of join that there are some hidden pitfalls that can be hard to detect with large data sets, particularly around the default subsearch data set sizes and search time length.
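
To show why a subsearch size limit is hard to spot, here is a minimal Python sketch (not SPL) of an inner join whose subsearch is cut off at a row cap. The cap `SUBSEARCH_MAX_ROWS = 3` and the sample values are invented for the illustration; the real limits live in limits.conf and are much larger. The point is only that a truncated subsearch raises no error, it just returns fewer matches.

```python
SUBSEARCH_MAX_ROWS = 3  # made-up cap, for illustration only


def capped_inner_join(main, sub, max_rows=SUBSEARCH_MAX_ROWS):
    """Inner join on name, keeping only the first `max_rows` subsearch rows."""
    truncated = sub[:max_rows]  # rows past the cap are silently dropped
    keep = set(truncated)
    return [name for name in main if name in keep]


# Hypothetical sample data (not the files from the original post).
main = ["a", "b", "c", "e", "f", "i"]
sub = ["d", "f", "g", "h", "i"]

full = capped_inner_join(main, sub, max_rows=len(sub))
capped = capped_inner_join(main, sub)

print(full)    # ['f', 'i']
print(capped)  # ['f']
```

Both calls succeed, so nothing in the result itself tells you that the match on i was lost.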

I find that SQL devs coming to Splunk will always try to skin the cat with a join and then increase limits when things don't work.

The alternative commands section at the top of the join documentation is a good starting point, and I have found it really useful to begin with stats when combining multiple disparate data sets. I've generally found it faster than join, and for really large data sets join just will not work in any reasonable time frame.

That's not to say that join doesn't have a use, but it should rarely be the go-to command for a join-type operation. Working out how to do it the stats way gives you a better understanding of the data/pipeline flow in SPL.
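
The stats way of thinking can be sketched in Python (not SPL) like this: search both data sets at once, tag each event with its source, group by the shared field, and then filter on which sources each group contains. The helper `sources_by_name` is a made-up name for this illustration, and the data is the two small files from the accepted answer. In SPL the grouping step would be a stats ... by name over both sources; treat the sketch as a model of the idea, not a translation of any particular search.

```python
from collections import defaultdict

names1 = ["a", "b", "c", "e", "f"]
names2 = ["d", "f", "g", "h", "i"]


def sources_by_name(datasets):
    """Group every name by the set of sources it appears in (hypothetical helper)."""
    grouped = defaultdict(set)
    for source, names in datasets.items():
        for name in names:
            grouped[name].add(source)
    return dict(grouped)


grouped = sources_by_name({"names1": names1, "names2": names2})

# Equivalent of an inner join: names seen in both sources.
inner = sorted(name for name, srcs in grouped.items() if len(srcs) == 2)

# Equivalent of a left join: every name seen in names1.
left = sorted(name for name, srcs in grouped.items() if "names1" in srcs)

print(inner)  # ['f']
print(left)   # ['a', 'b', 'c', 'e', 'f']
```

Because everything happens in one pass over the combined data, there is no subsearch involved and therefore no subsearch limit to run into.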
