<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Issue with parsing large dataset using Join in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-parsing-large-dataset-using-Join/m-p/479770#M134466</link>
    <description>&lt;P&gt;Hello,&lt;BR /&gt;
I am using the following search against two indexes because I want to combine their results on the common field "email". I am running it on my local Splunk instance, and both indexes are uploaded CSV files. I have configured limits.conf to handle large datasets. When I use the OR operator inside the join, I get different values for Clicked_links and Email_Delivered than when I search for just one of the events, which gives the correct output. Am I missing something here? Why is the OR operator trimming the results? I see 0 in a bunch of table rows that are normally populated with some number.&lt;/P&gt;

&lt;P&gt;index=IndexA&lt;BR /&gt;
| join type=inner email [ search index=IndexB ( event=delivered OR event=click ) | dedup email event | fields email, event ]&lt;BR /&gt;
| stats count(eval('event'="delivered")) as Email_Delivered&lt;BR /&gt;
count(eval('event'="click")) as Clicked_links&lt;BR /&gt;
by Region, Division, Country, Location&lt;BR /&gt;
| table Region, Division, Country, Location, "Email_Delivered" , Clicked_links&lt;/P&gt;</description>
    <pubDate>Wed, 30 Sep 2020 02:10:28 GMT</pubDate>
    <dc:creator>kiranpatil1985</dc:creator>
    <dc:date>2020-09-30T02:10:28Z</dc:date>
    <item>
      <title>Issue with parsing large dataset using Join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-parsing-large-dataset-using-Join/m-p/479770#M134466</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;
I am using the following search against two indexes because I want to combine their results on the common field "email". I am running it on my local Splunk instance, and both indexes are uploaded CSV files. I have configured limits.conf to handle large datasets. When I use the OR operator inside the join, I get different values for Clicked_links and Email_Delivered than when I search for just one of the events, which gives the correct output. Am I missing something here? Why is the OR operator trimming the results? I see 0 in a bunch of table rows that are normally populated with some number.&lt;/P&gt;

&lt;P&gt;index=IndexA&lt;BR /&gt;
| join type=inner email [ search index=IndexB ( event=delivered OR event=click ) | dedup email event | fields email, event ]&lt;BR /&gt;
| stats count(eval('event'="delivered")) as Email_Delivered&lt;BR /&gt;
count(eval('event'="click")) as Clicked_links&lt;BR /&gt;
by Region, Division, Country, Location&lt;BR /&gt;
| table Region, Division, Country, Location, "Email_Delivered" , Clicked_links&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 02:10:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-parsing-large-dataset-using-Join/m-p/479770#M134466</guid>
      <dc:creator>kiranpatil1985</dc:creator>
      <dc:date>2020-09-30T02:10:28Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with parsing large dataset using Join</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-parsing-large-dataset-using-Join/m-p/479771#M134467</link>
      <description>&lt;P&gt;Hi kiranpatil1985,&lt;BR /&gt;
subsearches are limited to 50,000 results, and the join command is also very slow, so I suggest approaching this problem in a different way, using the stats command.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(index=IndexA) OR (index=IndexB (event=delivered OR event=click))
| stats values(Region) as Region values(Division) as Division values(Country) as Country values(Location) as Location max(eval(if(event="delivered",1,0))) as delivered max(eval(if(event="click",1,0))) as clicked BY email
| stats sum(delivered) as Email_Delivered sum(clicked) as Clicked_links BY Region, Division, Country, Location
| table Region, Division, Country, Location, Email_Delivered, Clicked_links
&lt;/CODE&gt;&lt;/PRE&gt;
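
&lt;P&gt;One likely reason the OR version of the join under-counts is that join keeps only the first matching subsearch row per main-search event by default (max=1), so an email with both a delivered and a click row contributes only one of them. The sketch below imitates that behaviour in plain Python next to a group-by-email pass like the stats approach. It is only an illustration, not Splunk code: the sample rows and the helper names join_first_match and group_by_key are made up for the example.&lt;/P&gt;

```python
from collections import Counter

# Hypothetical sample rows standing in for the two indexes.
index_a = [
    {"email": "a@example.com", "Region": "EMEA"},
    {"email": "b@example.com", "Region": "EMEA"},
]
index_b = [
    {"email": "a@example.com", "event": "delivered"},
    {"email": "a@example.com", "event": "click"},
    {"email": "b@example.com", "event": "delivered"},
]


def join_first_match(left, right, key):
    """Inner join keeping only the first right-hand match per left row,
    imitating join with its default max=1."""
    first = {}
    for row in right:
        first.setdefault(row[key], row)
    return [{**l, **first[l[key]]} for l in left if l[key] in first]


def group_by_key(left, right, key):
    """Merge all rows that share a key and collect every event value,
    roughly what one grouping pass BY email does."""
    merged = {}
    for row in left:
        merged.setdefault(row[key], {"events": set()}).update(row)
    for row in right:
        if row[key] in merged:
            merged[row[key]]["events"].add(row["event"])
    return list(merged.values())


# Count events per type after each strategy.
joined = join_first_match(index_a, index_b, "email")
join_counts = Counter(row["event"] for row in joined)

grouped = group_by_key(index_a, index_b, "email")
stats_counts = Counter(e for row in grouped for e in row["events"])

print(dict(join_counts))
print(dict(stats_counts))
```

&lt;P&gt;With this sample data the first-match join never sees the click for a@example.com, while the grouped version counts it. If you want to keep the join, setting max=0 on it removes the one-match limit, but the subsearch result limit still applies.&lt;/P&gt;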

&lt;P&gt;Bye.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Thu, 12 Sep 2019 08:51:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-parsing-large-dataset-using-Join/m-p/479771#M134467</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2019-09-12T08:51:58Z</dc:date>
    </item>
  </channel>
</rss>

