<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to join large tables with more than 50,000 rows in Splunk? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152136#M42639</link>
    <description>&lt;P&gt;How do you join large tables?&lt;/P&gt;

&lt;P&gt;It is impossible to join tables with more than 50k rows in splunk, so I'm using some tricks, and these tricks are extremely annoying.&lt;/P&gt;

&lt;P&gt;Is there any "normal way"?&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;So, to summarize:&lt;BR /&gt;
there is no "beautiful" way to join in splunk. We need to use stats, eventstats and brainfuck.&lt;/P&gt;</description>
    <pubDate>Tue, 22 Jul 2014 08:04:19 GMT</pubDate>
    <dc:creator>0range</dc:creator>
    <dc:date>2014-07-22T08:04:19Z</dc:date>
    <item>
      <title>How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152136#M42639</link>
      <description>&lt;P&gt;How do you join large tables?&lt;/P&gt;

&lt;P&gt;It is impossible to join tables with more than 50k rows in splunk, so I'm using some tricks, and these tricks are extremely annoying.&lt;/P&gt;

&lt;P&gt;Is there any "normal way"?&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;So, to summarize:&lt;BR /&gt;
there is no "beautiful" way to join in splunk. We need to use stats, eventstats and brainfuck.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2014 08:04:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152136#M42639</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-07-22T08:04:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152137#M42640</link>
      <description>&lt;P&gt;The Splunk way to join data is not to use &lt;CODE&gt;join&lt;/CODE&gt;. What's your use case?&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2014 08:26:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152137#M42640</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-07-22T08:26:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152138#M42641</link>
      <description>&lt;P&gt;And what to use instead of join?&lt;/P&gt;

&lt;P&gt;My case is joining some data &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; from different applications and hence different sources/sourcetypes/indexes.&lt;BR /&gt;
Several million of rows, at least.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jul 2014 07:43:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152138#M42641</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-07-23T07:43:43Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152139#M42642</link>
      <description>&lt;P&gt;Is it possible to use STATS? &lt;/P&gt;

&lt;P&gt;(Example)&lt;BR /&gt;
index=index_A&lt;BR /&gt;
ID,NAME&lt;BR /&gt;
1,AAA&lt;BR /&gt;
2,BBB&lt;BR /&gt;
3,CCC&lt;BR /&gt;
4,DDD&lt;/P&gt;

&lt;P&gt;index=index_B&lt;BR /&gt;
ID,COUNT&lt;BR /&gt;
1,100&lt;BR /&gt;
2,200&lt;BR /&gt;
3,300&lt;BR /&gt;
4,400&lt;/P&gt;

&lt;P&gt;index=index_A OR index=index_B|stats first(NAME) as NAME,first(COUNT) as COUNT by ID&lt;/P&gt;

&lt;P&gt;ID,NAME,COUNT&lt;BR /&gt;
1,AAA,100&lt;BR /&gt;
2,BBB,200&lt;BR /&gt;
3,CCC,300&lt;BR /&gt;
4,DDD,400&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 17:08:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152139#M42642</guid>
      <dc:creator>HiroshiSatoh</dc:creator>
      <dc:date>2020-09-28T17:08:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152140#M42643</link>
      <description>&lt;P&gt;Here are a few options: &lt;A href="http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-join-append-or-use-of-subsearches"&gt;http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-join-append-or-use-of-subsearches&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jul 2014 09:53:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152140#M42643</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-07-23T09:53:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152141#M42644</link>
      <description>&lt;P&gt;ye, it's possible, but it is a little bit strange way.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2014 11:37:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152141#M42644</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-08-28T11:37:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152142#M42645</link>
      <description>&lt;P&gt;first problem: more than 2 indexes/tables&lt;BR /&gt;
second problem: different variables for different joins&lt;BR /&gt;
third problem: different names for the same variable&lt;/P&gt;

&lt;P&gt;when I haveto join three indexes A, B, C; and join A with B by id1 and B with C by id2 - it becomes MUCH more complicated. especially when the join-by-fields have different names in different indexes. 20 rows of and awful-formatting search request for such a simple thing... =(&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2014 11:38:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152142#M42645</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-08-28T11:38:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152143#M42646</link>
      <description>&lt;P&gt;I would encourage you not to use the &lt;STRONG&gt;join&lt;/STRONG&gt; command.  &lt;STRONG&gt;Join&lt;/STRONG&gt; is RDBMS thinking, but  Splunk works with data differently than an RDBMS does and most of the time &lt;STRONG&gt;join&lt;/STRONG&gt; is not needed, nor is it the best way to relate data.  If you see &lt;A href="http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-join-append-or-use-of-subsearches"&gt;this excellent post by MuS&lt;/A&gt;, he offers some much more efficient ways of searching across multiple tables (or sourctypes, or whatever it is that differentiates your data) without using &lt;STRONG&gt;join&lt;/STRONG&gt;.  To the problems that you mentioned:&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;first problem: more than 2 indexes/tables&lt;/EM&gt;:  This is no problem in Splunk.  Searching across multiple indexes/sourctypes is very easy, with no need to join for this operation.  I routinely search across multiple sourcetypes without needing to use join.&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Second problem:  different variables for different joins&lt;/EM&gt;:  We can address this once the details of the different variables for different joins are explained.&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Third problem:  different names for the same variable&lt;/EM&gt;:  Use &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/CommonEvalFunctions"&gt;eval's&lt;/A&gt; &lt;STRONG&gt;coalesce&lt;/STRONG&gt; function to make it so that you only have to deal with a single variable name.&lt;/P&gt;

&lt;P&gt;If (like @martin_mueller asked) you could share more details of your use case or could share your search, we can help you write a better search.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2014 12:37:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152143#M42646</guid>
      <dc:creator>wpreston</dc:creator>
      <dc:date>2014-08-28T12:37:16Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152144#M42647</link>
      <description>&lt;P&gt;Ok, I'll try to show what do I really need now.&lt;BR /&gt;
I suppose not-inner join's become a little problem now.&lt;/P&gt;

&lt;P&gt;A. What I've got&lt;BR /&gt;
1) sourcetype = a | table _time, id&lt;BR /&gt;
2) sourcetype = b | table bid, cid&lt;BR /&gt;
3) sourcetype = c | table cid&lt;/P&gt;

&lt;P&gt;B. What I need&lt;BR /&gt;
earliest=-1d@d latest=@d a &lt;BR /&gt;
left join &lt;BR /&gt;
earliest=-1d@d b on a.id = b.bid &lt;BR /&gt;
left join &lt;BR /&gt;
earliest=-1d@d c on b.cid = c.cid&lt;/P&gt;

&lt;P&gt;and gt a table: a._time, a.id, b.bid, c.cid from this join&lt;/P&gt;

&lt;P&gt;...to be continued...&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2014 16:01:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152144#M42647</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-08-28T16:01:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152145#M42648</link>
      <description>&lt;P&gt;So.&lt;BR /&gt;
C. What I do&lt;BR /&gt;
(sourcetype=a OR sourcetype=b OR sourcetype=c)&lt;BR /&gt;
| eval bid = coalesce(bid, id, cid)&lt;BR /&gt;
| stats min(_time) as _time, max(cid) as cid by bid&lt;BR /&gt;
| eval bid = if(cid = bid, null, bid)&lt;BR /&gt;
| eval cid = coalesce(cid, bid)&lt;BR /&gt;
| where not (isnull(bid))&lt;BR /&gt;
| stats max(bid) as bid by cid&lt;BR /&gt;
| eval a = bid | eval c = if(cid = bid, null, cid)&lt;/P&gt;

&lt;P&gt;// and eval b - unknown... have no time now &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;It all looks like a tricky and time wasting thing &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; I think it's possible to improve&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2014 16:15:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152145#M42648</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-08-28T16:15:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152146#M42649</link>
      <description>&lt;P&gt;Try this&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(sourcetype=a OR sourcetype=b OR sourcetype=c)
| table sourcetype id bid cid | eventstats first(bid) as bid by cid 
| eval id=coalesce(id,"").coalesce(bid,"") 
| stats values(id) as id values(bid) as bid values(cid) as cid values(sourcetype) as sourcetype by id 
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 28 Aug 2014 18:34:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152146#M42649</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2014-08-28T18:34:21Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152147#M42650</link>
      <description>&lt;P&gt;what does the point mean?&lt;BR /&gt;
| eval id=coalesce(id,"").coalesce(bid,"")&lt;/P&gt;</description>
      <pubDate>Fri, 29 Aug 2014 07:33:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152147#M42650</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-08-29T07:33:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152148#M42651</link>
      <description>&lt;P&gt;It should have been just |eval id=coalesce(id,bid) but my data for testing this was not having NULL but blank so used that. It basically takes first non-null value from id or bid and assign it to id.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Aug 2014 12:06:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152148#M42651</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2014-08-29T12:06:55Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152149#M42652</link>
      <description>&lt;P&gt;the description sounds like coalesce(id, bid, "")&lt;/P&gt;

&lt;P&gt;where can I read about the "." operator?&lt;/P&gt;</description>
      <pubDate>Fri, 29 Aug 2014 14:19:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152149#M42652</guid>
      <dc:creator>0range</dc:creator>
      <dc:date>2014-08-29T14:19:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152150#M42653</link>
      <description>&lt;P&gt;The dot operator concatenates two strings.&lt;/P&gt;

&lt;P&gt;Read "more" here: &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/Eval#Operators"&gt;http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/Eval#Operators&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 29 Aug 2014 15:02:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152150#M42653</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-08-29T15:02:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152151#M42654</link>
      <description>&lt;P&gt;This is bringing back a long-quiet post,  but reading this today I see nobody ever gave you a good stats search.  try this one.   I believe it addresses all the followup issues you identified in other answers. &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=a OR sourcetype=b OR sourcetype=c
| eval id=if(sourcetype="b",bid,id)
| eval a_time=if(sourcetype="a",_time,null())
| stats values(cid) as cid values(a_time) as a_time by id
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 30 Mar 2016 19:01:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152151#M42654</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2016-03-30T19:01:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to join large tables with more than 50,000 rows in Splunk?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152152#M42655</link>
      <description>&lt;P&gt;I have a similar issue.&lt;BR /&gt;
What if my data is something like below:&lt;/P&gt;

&lt;P&gt;index=a &lt;BR /&gt;
left join filter (index=a join Q&lt;BR /&gt;
                                 (index=xyz|table Q)&lt;BR /&gt;
                       table filter&lt;BR /&gt;
                      )&lt;BR /&gt;
table *&lt;/P&gt;</description>
      <pubDate>Wed, 02 Jan 2019 15:11:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-join-large-tables-with-more-than-50-000-rows-in-Splunk/m-p/152152#M42655</guid>
      <dc:creator>srujan9292</dc:creator>
      <dc:date>2019-01-02T15:11:36Z</dc:date>
    </item>
  </channel>
</rss>

