<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Finding matches between 3 different indexes. in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/506487#M141709</link>
    <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/64317"&gt;@rnowitzki&lt;/a&gt;&amp;nbsp;I was not able to reply to this post for some time. Anyway, the solution you proposed worked, and it is around 5 times faster than using the join command. Thanks again for helping me out with this one.&lt;/P&gt;</description>
    <pubDate>Mon, 29 Jun 2020 08:29:21 GMT</pubDate>
    <dc:creator>assennikolov</dc:creator>
    <dc:date>2020-06-29T08:29:21Z</dc:date>
    <item>
      <title>Finding matches between 3 different indexes.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470029#M132266</link>
      <description>&lt;P&gt;I have the following case: I have 3 different indexes (A, B and C). My goal is to find what percentage of the devices found in index B could also be found in index C. &lt;/P&gt;

&lt;P&gt;In index A I have the fields asset_name and mac_address (~1000 different devices)&lt;BR /&gt;
In index B I have the field src_mac (~150 different devices)&lt;BR /&gt;
In index C I have the field asset_name (~6000 different devices)&lt;/P&gt;

&lt;P&gt;Basically, I first look up the asset names of all 150 hosts from index B in index A (matching on MAC address). Then I compare those asset names to the asset names in index C.&lt;/P&gt;

&lt;P&gt;index="C"&lt;BR /&gt;
| stats count by asset_name&lt;BR /&gt;
| join type=left [search (index=A OR index=B)&lt;BR /&gt;
  | eval all_macAddresses = coalesce(src_mac, mac_address)&lt;BR /&gt;
  | stats values(asset_name) as asset_name values(src_mac) as src_mac values(mac_address) as mac_address by all_macAddresses&lt;BR /&gt;
  | eval match = if(src_mac == mac_address, "match", "no_match") &lt;BR /&gt;
  | where match="match"&lt;BR /&gt;
  | table asset_name all_macAddresses]&lt;BR /&gt;
| eval new=if(isnull(all_macAddresses),"NOT_OK","OK")&lt;BR /&gt;
| stats count by new&lt;/P&gt;

&lt;P&gt;I managed to get the results by using the above search but I was wondering whether this could be achieved without using any subsearches.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 05:36:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470029#M132266</guid>
      <dc:creator>assennikolov</dc:creator>
      <dc:date>2020-09-30T05:36:26Z</dc:date>
    </item>
    <item>
      <title>Re: Finding matches between 3 different indexes.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470030#M132267</link>
      <description>&lt;P&gt;It is difficult without having the actual data, but this might work:&lt;/P&gt;

&lt;P&gt;Instead of searching just index="C" in the first place, search all 3 indexes, so you have all the data in the search results. Then:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;    | eval all_macAddresses = coalesce(src_mac, mac_address)
    | stats values(testindex) as testindex values(all_macAddresses) as all_macAddresses by asset_name
    | where NOT testindex="B"
    | eval match=mvfilter(testindex LIKE "A%")
    | eval match=match+","+mvfilter(testindex LIKE "C%")
    | eval match=if(match="A,C", "OK", "NOT_OK")
    | stats count by match
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(I created a small table with makeresults that has a field "testindex" to simulate your different indexes, so just change testindex to index, the default field that holds the index name.)&lt;/P&gt;
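
&lt;P&gt;A sample table of that kind could be generated roughly as follows. This is only a sketch: the asset names and MAC addresses are invented for illustration, and the table actually used for testing is not shown in this thread.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval testindex="A", asset_name="sample-host-1", mac_address="00:00:00:00:00:01"
| append [| makeresults | eval testindex="A", asset_name="sample-host-2", mac_address="00:00:00:00:00:02"]
| append [| makeresults | eval testindex="B", src_mac="00:00:00:00:00:01"]
| append [| makeresults | eval testindex="C", asset_name="sample-host-1"]
| append [| makeresults | eval testindex="C", asset_name="sample-host-3"]
| fields - _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The append subsearches are there only to build the test rows; the pipeline being tested is pasted after the last line and does not need them.&lt;/P&gt;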

&lt;P&gt;This was quick and dirty. I guess the handling of the multivalue field can be made more efficient, for example by filtering directly for &lt;EM&gt;A and C&lt;/EM&gt; instead of putting those in a new field.&lt;/P&gt;

&lt;P&gt;The basic idea is to use stats on all the data instead of joining 2 subsets. So even if my SPL does not work &lt;EM&gt;as is&lt;/EM&gt; with your real data, I hope it points you in the right direction.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jun 2020 11:39:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470030#M132267</guid>
      <dc:creator>rnowitzki</dc:creator>
      <dc:date>2020-06-03T11:39:35Z</dc:date>
    </item>
    <item>
      <title>Re: Finding matches between 3 different indexes.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470031#M132268</link>
      <description>&lt;P&gt;Thanks for the fast reply. I tested your search; however, the end result shows:&lt;/P&gt;

&lt;P&gt;match         count&lt;BR /&gt;
NOT_OK     4462&lt;BR /&gt;
OK               1066 (~150 expected)&lt;/P&gt;

&lt;P&gt;So I assume that it ends up comparing only index A to index C (given that index B has only around 150 devices).&lt;/P&gt;

&lt;P&gt;Still, your reply was a great help, and I will work on first filtering the asset names from index A whose MAC addresses are found in index B.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jun 2020 07:57:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470031#M132268</guid>
      <dc:creator>assennikolov</dc:creator>
      <dc:date>2020-06-04T07:57:48Z</dc:date>
    </item>
    <item>
      <title>Re: Finding matches between 3 different indexes.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470032#M132269</link>
      <description>&lt;P&gt;As I still had my browser tab with sample data open, I further played around with it.&lt;BR /&gt;
I think I got it, but you would have to test it against your real data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;  | eval all_macAddresses = if(testindex="C","C",coalesce(src_mac, mac_address))
  | stats values(testindex) as testindex, values(asset_name) as asset_name by all_macAddresses
  | mvexpand asset_name
  | stats values(testindex) as testindex by asset_name
  | where asset_name!=""
  | eval match=mvcount(testindex)
  | eval match=if(match=3, "OK", "NOT_OK")
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It is based on the assumption that the assets you are looking for will have data in all three indexes: either hostname or MAC or both in A, B or C. So the final check compares the mvcount of the (test)index column against "3". Those should be the matching ones.&lt;BR /&gt;
In the first eval, I put "C" as the MAC address for assets in the C index, because the following stats command would otherwise remove all "C" data (if the MAC is empty). Try removing the lines one after another to see how the data is put together.&lt;/P&gt;

&lt;P&gt;Again, replace &lt;EM&gt;testindex&lt;/EM&gt; with &lt;EM&gt;index&lt;/EM&gt; to make it run for you.&lt;/P&gt;
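
&lt;P&gt;Put together, the full search against the real indexes might look roughly like this. It is a sketch under assumptions: A, B and C are the placeholder index names from the question, the index name is read from Splunk's default index field, the collected values are renamed to a hypothetical field called indexes, and the closing stats count by match is carried over from the first attempt to produce the OK/NOT_OK totals.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(index=A OR index=B OR index=C)
| eval all_macAddresses = if(index="C", "C", coalesce(src_mac, mac_address))
| stats values(index) as indexes, values(asset_name) as asset_name by all_macAddresses
| mvexpand asset_name
| stats values(indexes) as indexes by asset_name
| where asset_name!=""
| eval match=if(mvcount(indexes)=3, "OK", "NOT_OK")
| stats count by match
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The OK count is the number of asset names seen in all three indexes. Dividing it by the number of distinct devices in index B should give the percentage asked for in the question.&lt;/P&gt;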

&lt;P&gt;Would be interesting to see if and how much faster this is compared to the &lt;EM&gt;join&lt;/EM&gt; solution.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jun 2020 09:53:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/470032#M132269</guid>
      <dc:creator>rnowitzki</dc:creator>
      <dc:date>2020-06-04T09:53:28Z</dc:date>
    </item>
    <item>
      <title>Re: Finding matches between 3 different indexes.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/506487#M141709</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/64317"&gt;@rnowitzki&lt;/a&gt;&amp;nbsp;I was not able to reply to this post for some time. Anyway, the solution you proposed worked, and it is around 5 times faster than using the join command. Thanks again for helping me out with this one.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jun 2020 08:29:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Finding-matches-between-3-different-indexes/m-p/506487#M141709</guid>
      <dc:creator>assennikolov</dc:creator>
      <dc:date>2020-06-29T08:29:21Z</dc:date>
    </item>
  </channel>
</rss>

