Splunk Search

Using field option in join command

Path Finder

If I get a search like below:

index="main" ~~~~~ | table _time value code | join type=outer [search index="main" ~~~~  | table _time value]
| table _time value code

Should I fill in the fields(_time, value) like this, | join type=outer _time value [~~].
I know if the field option is blank, all of the main and sub-search result fields join, is it right?
Thanks.

0 Karma
1 Solution

SplunkTrust
SplunkTrust

You didn't supply a "what I hope to get out of this" for the code you posted, so it's hard to be sure about an answer.

One thing I can say is that I see no reason to do a join. The fine documentation does everything it can to steer you away from join, because it's very likely the single worst performing command in Splunk.

A better option in this case (there's actually several - this isn't the only way by ANY stretch of the imagination!) might be to use append...

index="main" ... | table _time value code | append [search index="main" ... | table _time value] | table _time value code

But there's a lot more that could be done.

There's no need to table it three times, all that could be simplified to

index="main" ... | append [search index="main" ... ] | table _time value code

But even more importantly, what are we even doing? If it's trying to blend together two "sets" of data, the Splunk way is to just dump them all into a big pile and stats them into behaving - if a stats is even necessary!

Try running this and just see what it is you get:

index="main" (...) OR (...) | table _time value code

the (...) OR (...) is both of your criteria. You may need additional parentheses, for good form I'd go with ((...) OR (...)), but it seemed more clear to write it the other way as an example.

If that's got duplication, then use stats to fix that.

index="main" (...) OR (...) | stats count by value code

If you don't like the count in there, remove it with the fields command.

index="main" (...) OR (...) | stats count by value code | fields - count

Anyway, I hope this helps. The resulting search - whatever it is that you come up with - will be undeniably better if it is even possible to avoid join and you do avoid it. There precious few times when a join is unavoidable. For instance, I've been doing this since 2014 on complicated data sets and haven't used join even once (well, except in training, and as a few tests once). And I'm originally from a SQL background!

Happy Splunking,
Rich

View solution in original post

SplunkTrust
SplunkTrust

You didn't supply a "what I hope to get out of this" for the code you posted, so it's hard to be sure about an answer.

One thing I can say is that I see no reason to do a join. The fine documentation does everything it can to steer you away from join, because it's very likely the single worst performing command in Splunk.

A better option in this case (there's actually several - this isn't the only way by ANY stretch of the imagination!) might be to use append...

index="main" ... | table _time value code | append [search index="main" ... | table _time value] | table _time value code

But there's a lot more that could be done.

There's no need to table it three times, all that could be simplified to

index="main" ... | append [search index="main" ... ] | table _time value code

But even more importantly, what are we even doing? If it's trying to blend together two "sets" of data, the Splunk way is to just dump them all into a big pile and stats them into behaving - if a stats is even necessary!

Try running this and just see what it is you get:

index="main" (...) OR (...) | table _time value code

the (...) OR (...) is both of your criteria. You may need additional parentheses, for good form I'd go with ((...) OR (...)), but it seemed more clear to write it the other way as an example.

If that's got duplication, then use stats to fix that.

index="main" (...) OR (...) | stats count by value code

If you don't like the count in there, remove it with the fields command.

index="main" (...) OR (...) | stats count by value code | fields - count

Anyway, I hope this helps. The resulting search - whatever it is that you come up with - will be undeniably better if it is even possible to avoid join and you do avoid it. There precious few times when a join is unavoidable. For instance, I've been doing this since 2014 on complicated data sets and haven't used join even once (well, except in training, and as a few tests once). And I'm originally from a SQL background!

Happy Splunking,
Rich

View solution in original post

SplunkTrust
SplunkTrust

The field(s) to use in the join are those that are present in both sides of the join and tell Splunk which events on each side are related. For example, join type=outer system [...] will combine events with the same system name.

---
If this reply helps you, an upvote would be appreciated.
0 Karma

Motivator

Could you please explain your goal here? What you have and what you want to achieve?

0 Karma