<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Multiple index join with different formatted data JSON and RAW is not working in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210800#M96352</link>
    <description>&lt;P&gt;just in case it's this - you have a typo - Note the "CUSTOEMR_ORDER_NUMBER".  &lt;/P&gt;

&lt;P&gt;Also you don't need join for this and you can do it with just stats. &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(index=esbetalog source=PRD (LINE_OF_BUSINESS_CD="R" OR LINE_OF_BUSINESS_CD="C")) OR (index=etaprd source=PRD) 
| table CUSTOMER_ORDER_NUMBER, ETA_FROM_TIME, ETA_TO_TIME, ARRIVECUSTOMERSTAMP, LINE_OF_BUSINESS_CD index 
| stats values(*) as * by CUSTOMER_ORDER_NUMBER 
| sort ETA_FROM_TIME
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It will run quite a lot faster this way without the join, and not be subject to various truncation limits of the join command.  The rename command seemed to be redundant so I removed it as well. &lt;/P&gt;

&lt;P&gt;Note: &lt;CODE&gt;values(*) as *&lt;/CODE&gt; should be used sparingly - note here I am careful to use the table command beforehand, so that it's only working with 5 fields.&lt;/P&gt;

&lt;P&gt;There might be a tweak or so needed - but try it out and I can update the answer as necessary.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2020 11:08:00 GMT</pubDate>
    <dc:creator>sideview</dc:creator>
    <dc:date>2020-09-29T11:08:00Z</dc:date>
    <item>
      <title>Multiple index join with different formatted data JSON and RAW is not working</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210798#M96350</link>
      <description>&lt;P&gt;I have esbetalog in JSON format and etaprd in RAW format and outer joined as with CUSTOMER_ORDER_NUMBER column&lt;BR /&gt;
both has same CUSTOMER_ORDER_NUMBER data but etaprd data is not coming in result&lt;BR /&gt;
etaprd  can have one or more _raw event data&lt;/P&gt;

&lt;P&gt;Sample query below,&lt;/P&gt;

&lt;P&gt;index=esbetalog  source=PRD  (LINE_OF_BUSINESS_CD="R" OR LINE_OF_BUSINESS_CD="C") | rename .CUSTOMER_ORDER_NUMBER as CUSTOEMR_ORDER_NUMBER | join type=left CUSTOMER_ORDER_NUMBER [search index=etaprd source=PRD ] |  table CUSTOMER_ORDER_NUMBER, ETA_FROM_TIME, ETA_TO_TIME, ARRIVECUSTOMERSTAMP, LINE_OF_BUSINESS_CD | sort ETA_FROM_TIME&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 11:07:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210798#M96350</guid>
      <dc:creator>ppanchal</dc:creator>
      <dc:date>2020-09-29T11:07:50Z</dc:date>
    </item>
    <item>
      <title>Re: Multiple index join with different formatted data JSON and RAW is not working</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210799#M96351</link>
      <description>&lt;P&gt;For one, the name of the field is misspelled in your rename command &lt;CODE&gt;CUSTOEMR_ORDER_NUMBER&lt;/CODE&gt; should be &lt;CODE&gt;CUSTOMER_ORDER_NUMBER&lt;/CODE&gt;, causing your join to fail and not getting any data from etaprd index. Update that and try again.&lt;/P&gt;

&lt;P&gt;Further, joins are expensive, so give this alternative a try&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=esbetalog source=PRD (LINE_OF_BUSINESS_CD="R" OR LINE_OF_BUSINESS_CD="C")  OR (index=etaprd source=PRD)
| rename *CUSTOMER_ORDER_NUMBER as CUSTOMER_ORDER_NUMBER1
| eval CUSTOMER_ORDER_NUMBER=coalesce(CUSTOMER_ORDER_NUMBER,CUSTOMER_ORDER_NUMBER1)
| stats values(ETA_FROM_TIME) as ETA_FROM_TIME, values(ETA_TO_TIME) as ETA_TO_TIME , values(ARRIVECUSTOMERSTAMP) as ARRIVECUSTOMERSTAMP values(LINE_OF_BUSINESS_CD) as LINE_OF_BUSINESS_CD dc(index) as index by CUSTOMER_ORDER_NUMBER | where index=2 | fields - index
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 22 Sep 2016 18:49:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210799#M96351</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2016-09-22T18:49:13Z</dc:date>
    </item>
    <item>
      <title>Re: Multiple index join with different formatted data JSON and RAW is not working</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210800#M96352</link>
      <description>&lt;P&gt;just in case it's this - you have a typo - Note the "CUSTOEMR_ORDER_NUMBER".  &lt;/P&gt;

&lt;P&gt;Also you don't need join for this and you can do it with just stats. &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(index=esbetalog source=PRD (LINE_OF_BUSINESS_CD="R" OR LINE_OF_BUSINESS_CD="C")) OR (index=etaprd source=PRD) 
| table CUSTOMER_ORDER_NUMBER, ETA_FROM_TIME, ETA_TO_TIME, ARRIVECUSTOMERSTAMP, LINE_OF_BUSINESS_CD index 
| stats values(*) as * by CUSTOMER_ORDER_NUMBER 
| sort ETA_FROM_TIME
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It will run quite a lot faster this way without the join, and not be subject to various truncation limits of the join command.  The rename command seemed to be redundant so I removed it as well. &lt;/P&gt;

&lt;P&gt;Note: &lt;CODE&gt;values(*) as *&lt;/CODE&gt; should be used sparingly - note here I am careful to use the table command beforehand, so that it's only working with 5 fields.&lt;/P&gt;

&lt;P&gt;There might be a tweak or so needed - but try it out and I can update the answer as necessary.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 11:08:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210800#M96352</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2020-09-29T11:08:00Z</dc:date>
    </item>
    <item>
      <title>Re: Multiple index join with different formatted data JSON and RAW is not working</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210801#M96353</link>
      <description>&lt;P&gt;I tried your search query as and it did give me some result as shown below.&lt;/P&gt;

&lt;P&gt;CUSTOMER_ORDER_NUMBER ARRIVECUSTOMERSTAMP    ETA_FROM_TIME       ETA_TO_TIME             LINE_OF_BUSINESS_CD    index&lt;BR /&gt;
3787312003        2016-09-22 03:45:14.0  2016-09-21 02:01:00.000 2016-09-22 02:46:00.000 C                          esbetalog &lt;BR /&gt;
                                             2016-09-21 05:01:00.000 2016-09-22 05:46:00.000                            etaprd   &lt;/P&gt;

&lt;P&gt;However, index esbetalog has multiple event for same CUSTOMER_ORDER_NUMBER.  What I need that index 'etaprd' values to be repeated for those many event 'esbetalog' matching index events. Table output should look like as One-To-Many join.  Another issue I have is that 'esbetalog' index will have 4k event at the most per day where as 'etaprd' index will have 2 million events per day.  Because of that it is downgrading the performance. I dont need to bring unnecessary eventdata from 'etaprd' except only matching CUSTOMER_ORDER_NUMBER data.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 11:08:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210801#M96353</guid>
      <dc:creator>ppanchal</dc:creator>
      <dc:date>2020-09-29T11:08:02Z</dc:date>
    </item>
    <item>
      <title>Re: Multiple index join with different formatted data JSON and RAW is not working</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210802#M96354</link>
      <description>&lt;P&gt;OK.  Great description, and I think I understand.  I also think I have to know which fields are coming from which side.   So here I've made an arbitrary assumption that it's &lt;CODE&gt;ETA_TO_TIME&lt;/CODE&gt; and and &lt;CODE&gt;ETA_FROM_TIME&lt;/CODE&gt; that are the fields only present in the sparser index=etaprd events.   Most likely this assumption is of course wrong, but you can see how that assumption is represented in the query below and modify it to match reality.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(index=esbetalog source=PRD (LINE_OF_BUSINESS_CD="R" OR LINE_OF_BUSINESS_CD="C")) OR (index=etaprd source=PRD) CUSTOMER_ORDER_NUMBER=*
| table CUSTOMER_ORDER_NUMBER, ETA_FROM_TIME, ETA_TO_TIME, ARRIVECUSTOMERSTAMP, LINE_OF_BUSINESS_CD index 
| eventstats values(ETA_TO_TIME) as ETA_TO_TIME values(ETA_FROM_TIME) as ETA_FROM_TIME by CUSTOMER_ORDER_NUMBER
| search index=esbetalog
| sort ETA_FROM_TIME
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A couple things are changed here from my base answer. &lt;/P&gt;

&lt;P&gt;1) I snuck in another search term of &lt;CODE&gt;CUSTOMER_ORDER_NUMBER=*&lt;/CODE&gt;  This may or may not result in fewer events coming off disk, so you might want to add a couple other raw text terms that you know will be present in the right events. &lt;BR /&gt;
2) Using eventstats instead of stats.   This will not transform the rows, ie the output rows will be the same as the input rows, with some extra fields.   The specific effect of the eventstats command here will be to copy the ETA_TO_TIME and ETA_FROM_TIME from the index=etaprd events over onto the corresponding index=esbetalog events for each given CUSTOMER_ORDER_NUMBER&lt;BR /&gt;
3) An extra search command after the eventstats will filter down to just the esbetalog events. &lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 11:08:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Multiple-index-join-with-different-formatted-data-JSON-and-RAW/m-p/210802#M96354</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2020-09-29T11:08:13Z</dc:date>
    </item>
  </channel>
</rss>

