<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Union / Intersect results from different source type in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25466#M4146</link>
    <description>&lt;P&gt;Makes sense; however, if the two different sourcetypes have different fields (say "url1" and "url2") that I want to union into one field, what's the best way to do that?  &lt;/P&gt;

&lt;P&gt;I'm trying to avoid this delimiter solution below if possible...&lt;BR /&gt;
yoursearchhere |&lt;BR /&gt;
eval output = field1 + ";" + field2 |&lt;BR /&gt;
makemv delim=";" output |&lt;BR /&gt;
mvexpand output&lt;/P&gt;</description>
    <pubDate>Mon, 08 Oct 2012 20:22:55 GMT</pubDate>
    <dc:creator>e_sherlock</dc:creator>
    <dc:date>2012-10-08T20:22:55Z</dc:date>
    <item>
      <title>Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25460#M4140</link>
      <description>&lt;P&gt;Hi - I'm trying to union/intersect results from different source types using the SET command:&lt;/P&gt;

&lt;P&gt;set union [search sourcetype="first_source" 404 | fields url] [search sourcetype="second_source" 303 | fields url]&lt;/P&gt;

&lt;P&gt;However, it's not returning any results, even though each search returns the results I want when I run it separately.&lt;/P&gt;

&lt;P&gt;Does anyone have experience with the SET command?  The documentation for the SET command doesn't really have any detailed explanations.  If it's not possible to union/intersect results from different source types, what would be a good way to correlate the data?&lt;/P&gt;

&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 05 Aug 2010 03:18:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25460#M4140</guid>
      <dc:creator>clincg</dc:creator>
      <dc:date>2010-08-05T03:18:29Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25461#M4141</link>
      <description>&lt;P&gt;The set command needs a leading pipe to distinguish it as a search command and not a search term...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| set union [search sourcetype="first_source" 404 | fields url] [search sourcetype="second_source" 303 | fields url]
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 05 Aug 2010 03:39:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25461#M4141</guid>
      <dc:creator>bwooden</dc:creator>
      <dc:date>2010-08-05T03:39:17Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25462#M4142</link>
      <description>&lt;P&gt;Thanks, that seems to return some results.  However, when I changed the "union" to "intersect", it returned 0 results.  Is the intersect operation only comparing the fields specified in the " | fields " section, or does it actually compare the whole event entry and only mark it as intersect if the two events are exactly the same?&lt;/P&gt;</description>
      <pubDate>Thu, 05 Aug 2010 05:13:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25462#M4142</guid>
      <dc:creator>clincg</dc:creator>
      <dc:date>2010-08-05T05:13:40Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25463#M4143</link>
      <description>&lt;PRE&gt;&lt;CODE&gt; | set union [search sourcetype="first_source" 404 | fields url | fields - _* ] [search sourcetype="second_source" 303 | fields url | fields - _* ]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It's weird, but by default set will also compare hidden fields such as &lt;CODE&gt;_time&lt;/CODE&gt; and &lt;CODE&gt;_raw&lt;/CODE&gt;, which you probably do not want.&lt;/P&gt;

&lt;P&gt;The above search should take care of this.  &lt;/P&gt;
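
&lt;P&gt;As a rough illustration outside Splunk, the Python sketch below models why comparing whole rows gives an empty intersect and why dropping the underscore-prefixed fields fixes it. The sample events and the helper names (&lt;CODE&gt;as_rows&lt;/CODE&gt;, &lt;CODE&gt;intersect&lt;/CODE&gt;) are made up for this example; it is a model of the behavior described here, not how &lt;CODE&gt;set&lt;/CODE&gt; is actually implemented.&lt;/P&gt;

```python
# Hypothetical rows standing in for events from the two subsearches.
first = [
    {"url": "/a", "_time": 100, "_raw": "GET /a 404"},
    {"url": "/b", "_time": 101, "_raw": "GET /b 404"},
]
second = [
    {"url": "/b", "_time": 205, "_raw": "GET /b 303"},
    {"url": "/c", "_time": 206, "_raw": "GET /c 303"},
]


def as_rows(events, drop_hidden):
    """Turn each event into a hashable row, optionally dropping _* fields."""
    rows = set()
    for event in events:
        items = list(event.items())
        if drop_hidden:
            items = [(k, v) for k, v in items if not k.startswith("_")]
        rows.add(tuple(sorted(items)))
    return rows


def intersect(a, b, drop_hidden):
    """Rows present in both inputs, compared on every remaining field."""
    return as_rows(a, drop_hidden).intersection(as_rows(b, drop_hidden))


# Hidden fields differ between the two sources, so no whole row matches.
with_hidden = intersect(first, second, drop_hidden=False)
# With the _* fields removed, only url is compared and "/b" is shared.
without_hidden = intersect(first, second, drop_hidden=True)

print(len(with_hidden))
print(sorted(without_hidden))
```

&lt;P&gt;In this model the first result is empty because &lt;CODE&gt;_time&lt;/CODE&gt; and &lt;CODE&gt;_raw&lt;/CODE&gt; never line up across the two sources, while the second contains only the shared url.&lt;/P&gt;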

&lt;P&gt;&lt;/P&gt;&lt;HR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Update:&lt;/STRONG&gt;  I've mentioned this to the doc team, and they have now noted this on the &lt;A href="http://www.splunk.com/base/Documentation/latest/SearchReference/Set" rel="nofollow"&gt;set&lt;/A&gt; documentation page.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Aug 2010 01:47:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25463#M4143</guid>
      <dc:creator>Lowell</dc:creator>
      <dc:date>2010-08-06T01:47:44Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25464#M4144</link>
      <description>&lt;P&gt;Thanks not just for the follow-up on Answers but for the improved documentation as well!&lt;/P&gt;</description>
      <pubDate>Sat, 07 Aug 2010 01:08:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25464#M4144</guid>
      <dc:creator>bwooden</dc:creator>
      <dc:date>2010-08-07T01:08:41Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25465#M4145</link>
      <description>&lt;P&gt;This is not an efficient way to solve this problem. With Splunk, it's much, much better to have the work done in a single search pass with an OR than to run two searches and combine the results.&lt;/P&gt;

&lt;P&gt;For the union, search with either of the following (the second also counts events per url):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(sourcetype=first_source 404) OR (sourcetype=second_source 303) | fields url
(sourcetype=first_source 404) OR (sourcetype=second_source 303) | stats count by url
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The efficient version of intersect is very similar. Here we count the number of distinct sourcetypes per url and keep only those urls with more than one sourcetype (that is, urls that appear in both sourcetypes):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(sourcetype=first_source 404) OR (sourcetype=second_source 303) | stats count dc(sourcetype) as sourcetypes by url | search sourcetypes &amp;gt; 1 
&lt;/CODE&gt;&lt;/PRE&gt;
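
&lt;P&gt;The counting idea behind that search can be sketched outside Splunk as well. The Python below is only an illustration under assumed data: the &lt;CODE&gt;events&lt;/CODE&gt; list and the function name &lt;CODE&gt;urls_in_multiple_sourcetypes&lt;/CODE&gt; are hypothetical, and the code mirrors the distinct-count-then-filter logic rather than anything Splunk runs internally.&lt;/P&gt;

```python
from collections import defaultdict

# Hypothetical events as (sourcetype, url) pairs from the single OR search.
events = [
    ("first_source", "/a"),
    ("first_source", "/b"),
    ("second_source", "/b"),
    ("second_source", "/c"),
    ("first_source", "/a"),
]


def urls_in_multiple_sourcetypes(rows):
    """Return urls seen under more than one distinct sourcetype."""
    seen = defaultdict(set)
    for sourcetype, url in rows:
        # A set keeps each sourcetype once, like a distinct count per url.
        seen[url].add(sourcetype)
    return sorted(url for url, types in seen.items() if len(types) > 1)


common = urls_in_multiple_sourcetypes(events)
print(common)
```

&lt;P&gt;A url that repeats within one sourcetype (like "/a" in the sample data) is not kept, because the filter looks at distinct sourcetypes rather than the raw event count.&lt;/P&gt;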

&lt;P&gt;These searches can (and will) return an unlimited number of rows, and statistics will be accurate on the entire result set. Using &lt;CODE&gt;set&lt;/CODE&gt; will only pull a limited number of rows from each search.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Aug 2010 04:30:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25465#M4145</guid>
      <dc:creator>Stephen_Sorkin</dc:creator>
      <dc:date>2010-08-21T04:30:07Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25466#M4146</link>
      <description>&lt;P&gt;Makes sense; however, if the two different sourcetypes have different fields (say "url1" and "url2") that I want to union into one field, what's the best way to do that?  &lt;/P&gt;

&lt;P&gt;I'm trying to avoid this delimiter solution below if possible...&lt;BR /&gt;
yoursearchhere |&lt;BR /&gt;
eval output = field1 + ";" + field2 |&lt;BR /&gt;
makemv delim=";" output |&lt;BR /&gt;
mvexpand output&lt;/P&gt;</description>
      <pubDate>Mon, 08 Oct 2012 20:22:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25466#M4146</guid>
      <dc:creator>e_sherlock</dc:creator>
      <dc:date>2012-10-08T20:22:55Z</dc:date>
    </item>
    <item>
      <title>Re: Union / Intersect results from different source type</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25467#M4147</link>
      <description>&lt;P&gt;Hi Stephen,&lt;/P&gt;

&lt;P&gt;I have two big data sets. One has 300 million records with only one field, which is a hash. The other has 20 million records with 3 fields, including username and hashes. I would like to compare the two sets to find any hashes that appear in both. Is there an effective way to do this other than the two solutions above? I have already tried them and they are not effective.&lt;/P&gt;</description>
      <pubDate>Thu, 10 Aug 2017 20:33:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Union-Intersect-results-from-different-source-type/m-p/25467#M4147</guid>
      <dc:creator>thambisetty</dc:creator>
      <dc:date>2017-08-10T20:33:58Z</dc:date>
    </item>
  </channel>
</rss>

