<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merging 3 interlinked large data sets with different ref keys in two individual merges in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294108#M88796</link>
    <description>&lt;P&gt;This works! Thanks for the new perspective. I will post the final results once I have done further analysis on Monday.&lt;/P&gt;</description>
    <pubDate>Sat, 19 Aug 2017 11:37:38 GMT</pubDate>
    <dc:creator>splunk4now</dc:creator>
    <dc:date>2017-08-19T11:37:38Z</dc:date>
    <item>
      <title>Merging 3 interlinked large data sets with different ref keys in two individual merges</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294104#M88792</link>
      <description>&lt;P&gt;I have 3 data sets (say src1, src2, src3). The result set of even a single merge exceeds the 50k limit, so a normal join or sub-search is ruled out. In this case src1 references a field in src2, and src2 references a different field in src3. The objective is to merge across all three data sets, based on certain conditions, in two individual merges (src1 &amp;amp; src2; src2 &amp;amp; src3). I managed to get two sources merged; however, merging the third seems perplexing due to the limitations on running a sub-search or join over the stats results. Any thoughts?&lt;/P&gt;

&lt;P&gt;Code for merging two sources:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="srctype:source1" OR sourcetype="srctype:source2"
 | eval new_gid = coalesce(ref_src1,ref_src2)
 | stats list(field1_src1) as field1_src1 
         list(field2_src1) as field2_src1
         list(field3_src1) as field3_src1
         list(field1_src2) as field1_src2 
         list(field2_src2) as field2_src2
   by new_gid
 | where field3_src1="cat1" OR field3_src1="cat2"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;A single stats result row for the above search would look something like this.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;new_gid               field1_src1    field2_src1    field3_src1    field1_src2    field2_src2
03759542db9fb2404     3958294        5              cat1           69.13          c2bf762edb7b1600ab
                                                                   266.7          7f4f9f7adb44e600b1
                                                                   0.01           7f4f9f7adb44e600b1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This result set would then have to be merged with src3 using field2_src2, so that only one value remains instead of three (based on the values of certain fields in src3).&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 15:27:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294104#M88792</guid>
      <dc:creator>splunk4now</dc:creator>
      <dc:date>2020-09-29T15:27:13Z</dc:date>
    </item>
    <item>
      <title>Re: Merging 3 interlinked large data sets with different ref keys in two individual merges</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294105#M88793</link>
      <description>&lt;P&gt;I'm confused about your "limitations on sub-search or join" comment since your query contains neither a subsearch nor a join.&lt;BR /&gt;
What happens when you run this query?&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="srctype:source1" OR sourcetype="srctype:source2" OR sourcetype="srctype:source3"
  | eval new_gid = coalesce(ref_src1,ref_src2,ref_src3)
  | stats list(field1_src1) as field1_src1 
          list(field2_src1) as field2_src1
          list(field3_src1) as field3_src1
          list(field1_src2) as field1_src2 
          list(field2_src2) as field2_src2
          list(field1_src3) as field1_src3
          list(field2_src3) as field2_src3
    by new_gid
  | where field3_src1="cat1" OR field3_src1="cat2" OR field3_src1="cat3"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 18 Aug 2017 13:34:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294105#M88793</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2017-08-18T13:34:30Z</dc:date>
    </item>
    <item>
      <title>Re: Merging 3 interlinked large data sets with different ref keys in two individual merges</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294106#M88794</link>
      <description>&lt;P&gt;Start by getting ONLY the relevant records from all sources. Apply the filters on source 3 up front.&lt;/P&gt;

&lt;P&gt;Use &lt;CODE&gt;eventstats&lt;/CODE&gt; to roll the data from the farthest-right source onto its neighbor and then drop that source, one source at a time, until you're ready for &lt;CODE&gt;stats&lt;/CODE&gt;.&lt;/P&gt;
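
&lt;P&gt;To see a single roll step in isolation, here is a minimal run-anywhere sketch. The &lt;CODE&gt;makeresults&lt;/CODE&gt; scaffolding and the values g1, k1 and k2 are made up for illustration and are not your data; everything after the scaffolding mirrors the approach. Under these assumptions, the S2 event carrying k2 finds no partner in source 3 and should be dropped, leaving one g1 row with k1 and bar.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=4
| streamstats count as n
| rename COMMENT as "made-up events: one S1, two S2, and one S3 that already passed the up-front filter"
| eval sourcetype=case(n==1,"S1", n==2 OR n==3,"S2", n==4,"S3")
| eval ref_src1=if(n==1,"g1",null())
| eval ref_src2=if(sourcetype=="S2","g1",null())
| eval field2_src2=case(n==2,"k1", n==3,"k2")
| eval ref_src3=if(n==4,"k1",null())
| eval field1_src3=if(n==4,"bar",null())

| rename COMMENT as "roll source3 onto source2 through their shared key, then drop source3 and unmatched source2"
| eval ref_src3=coalesce(ref_src3,field2_src2)
| eventstats values(field1_src3) as field1_src3 by ref_src3
| where sourcetype="S1" OR (sourcetype="S2" AND isnotnull(field1_src3))

| rename COMMENT as "roll what is left together on the src1/src2 key"
| eval new_gid=coalesce(ref_src1,ref_src2)
| stats list(field2_src2) as field2_src2 list(field1_src3) as field1_src3 by new_gid
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The search for your data applies the same step to the real sourcetypes:&lt;/P&gt;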

&lt;PRE&gt;&lt;CODE&gt;   (sourcetype="S1" (field3_src1="cat1" OR field3_src1="cat2") )
OR  sourcetype="S2" 
OR (sourcetype="S3" (field1_src3="bar" OR field2_src3="baz") )

| rename COMMENT as "now we limit to the relevant fields"
| fields sourcetype ref_src1 field1_src1 field2_src1 field3_src1 ref_src2 field1_src2 ref_src3 field2_src2 field1_src3 field2_src3

| rename COMMENT as "roll the data from source3 onto source2, then drop all source3 and any source2 that have no match "
| eval ref_src3=coalesce(ref_src3,field2_src2)
| eventstats values(field1_src3) as field1_src3, values(field2_src3) as field2_src3 by ref_src3
| where sourcetype="S1" OR ( sourcetype="S2" AND isnotnull(field1_src3))

| rename COMMENT as "roll all the remaining data together now"
| eval new_gid = coalesce(ref_src1,ref_src2)
| stats list(*) as * by new_gid
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 18 Aug 2017 18:40:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294106#M88794</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-08-18T18:40:02Z</dc:date>
    </item>
    <item>
      <title>Re: Merging 3 interlinked large data sets with different ref keys in two individual merges</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294107#M88795</link>
      <description>&lt;P&gt;I meant the limits imposed on join/sub-search result sets; I was trying out a mechanism to merge the "result set of src1 and src2" with src3. The above search query (which I had already tried earlier) does not return any values for field1/field2 from src3, since I assume the coalesce function expects the same key value in all 3 sources. In this case src1/src2 share one common reference key and src2/src3 share a different common reference key.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Aug 2017 19:46:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294107#M88795</guid>
      <dc:creator>splunk4now</dc:creator>
      <dc:date>2017-08-18T19:46:32Z</dc:date>
    </item>
    <item>
      <title>Re: Merging 3 interlinked large data sets with different ref keys in two individual merges</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294108#M88796</link>
      <description>&lt;P&gt;This works! Thanks for the new perspective. I will post the final results once I have done further analysis on Monday.&lt;/P&gt;</description>
      <pubDate>Sat, 19 Aug 2017 11:37:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Merging-3-interlinked-large-data-sets-with-different-ref-keys-in/m-p/294108#M88796</guid>
      <dc:creator>splunk4now</dc:creator>
      <dc:date>2017-08-19T11:37:38Z</dc:date>
    </item>
  </channel>
</rss>

