Splunk Search

Why is join slow?

chirsf
Explorer

I've seen a lot of advice against using join with subsearches because it is slow, and that has proved true in practice.

What I would like to find out is why it is slow. Any insight here would be helpful.

1 Solution

nickhills
Ultra Champion

Take a look at these two articles, specifically the posts by @daljeanis :

https://answers.splunk.com/answers/561130/how-to-join-two-tables-where-the-key-is-named-diff.html
https://answers.splunk.com/answers/660008/which-is-the-best-approach-to-join-two-database-ta.html

The problem is that join is an SQL concept, and Splunk is not a relational database. The command exists (and works), but it's very often not the best approach.

If my comment helps, please give it a thumbs up!

somesoni2
Revered Legend

I believe it's slow because of the algorithm and virtual memory the join command uses (it basically has to build a Cartesian product of the two datasets and then work from there). The amount of processing and memory consumption also often causes join subsearches to time out. If you've not read it already, here is an excellent Splunk documentation page on when to use join and when to use its alternatives.

https://docs.splunk.com/Documentation/Splunk/7.2.4/Search/Abouteventcorrelation
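
To make the Cartesian-product point concrete, here is a rough Python sketch. It is a toy model only, not Splunk's actual implementation: the sample rows, the field names (customer_id, total, name) and both helper functions are made up for illustration. It contrasts a naive join that compares every pair of rows with a single-pass grouping by key, which is loosely the idea behind replacing join with a stats ... by search.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names are invented for this sketch.
orders = [
    {"customer_id": 1, "total": 30},
    {"customer_id": 2, "total": 15},
    {"customer_id": 1, "total": 5},
]
customers = [
    {"customer_id": 1, "name": "alice"},
    {"customer_id": 2, "name": "bob"},
]


def nested_loop_join(left, right, key):
    """Naive join: compare every left row with every right row, keep matches."""
    comparisons = 0
    joined = []
    for left_row in left:
        for right_row in right:
            comparisons += 1
            if left_row[key] == right_row[key]:
                joined.append({**right_row, **left_row})
    return joined, comparisons


def group_by_key(rows, key):
    """Single pass over one combined stream, collecting field values per key."""
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for field, value in row.items():
            if field != key:
                groups[row[key]][field].append(value)
    return {k: dict(fields) for k, fields in groups.items()}


joined, comparisons = nested_loop_join(orders, customers, "customer_id")
grouped = group_by_key(orders + customers, "customer_id")

print(comparisons)  # 6, i.e. len(orders) * len(customers)
print(grouped[1])   # {'total': [30, 5], 'name': ['alice']}
```

In this toy model the naive join does len(left) * len(right) comparisons, while the grouping touches each row once, so the gap widens quickly as either dataset grows.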
