Splunk Search

Outer Join in Splunk

afurze
Engager

All,

I've been trying to find a solution for this for a few days.  We have multiple tools sending data in on their coverage and we would like to have a search that will show hosts which exist in one but not the other, in SQL terms, an OUTER JOIN.

I have found that Splunk doesn't support a true outer join, so I'm still searching for a solution.

Edit: spelling

Labels (1)
0 Karma
1 Solution

gcusello
SplunkTrust
SplunkTrust

Hi @afurze,

Splunk isn't a db!

it has the join command but it's better to avoid it because it's very slow!

You could have two approaches:

  1. if you haven't many events from the second host (less than 50,000) you can use the subsearch;
  2. if you have many events (more than 50,000) you have to use stats.

so if you have events in different indexes (index_A or index_B)

in the first case:

index=index_A NOT [ search index=index_B | dedup host | fields host]
| ...

In the second case:

index=index_ OR index=index_B
| stats dc(index) AS dc_index values(index) AS index BY host
| where dc_index=1 AND index=index_A

If you have your data all in the same index, you have to separate events using the sourcetype or another field.

Ciao.

Giuseppe

 

View solution in original post

gcusello
SplunkTrust
SplunkTrust

Hi @afurze,

Splunk isn't a db!

it has the join command but it's better to avoid it because it's very slow!

You could have two approaches:

  1. if you haven't many events from the second host (less than 50,000) you can use the subsearch;
  2. if you have many events (more than 50,000) you have to use stats.

so if you have events in different indexes (index_A or index_B)

in the first case:

index=index_A NOT [ search index=index_B | dedup host | fields host]
| ...

In the second case:

index=index_ OR index=index_B
| stats dc(index) AS dc_index values(index) AS index BY host
| where dc_index=1 AND index=index_A

If you have your data all in the same index, you have to separate events using the sourcetype or another field.

Ciao.

Giuseppe

 

afurze
Engager

This is exactly what I need, thanks!

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...