<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Exclusion of some events based on a different dataset in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/585977#M204128</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;SPAN&gt;... was experimenting with&amp;nbsp;&lt;/SPAN&gt;&lt;BLOCKQUOTE&gt;&lt;FONT face="courier new,courier"&gt;(basesearch) OR (criteria)&lt;BR /&gt;| stats values(DataSetType) as DataSetType by joinkey&lt;BR /&gt;| where mvcount(DataSetType) == 1&lt;/FONT&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;which is impossibly slower than my original logic.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;HR /&gt;&lt;P&gt;Since this topic is really about avoiding &lt;FONT face="courier new,courier"&gt;join&lt;/FONT&gt; to improve performance, it is worth an update: I have since tested "OR" in the raw search more and found that the above conclusion was false; I had been testing the two methods under different conditions. &amp;nbsp;Using "OR" is in fact the more advantageous method.&lt;/P&gt;&lt;P&gt;What motivated this update, though, was Nick Mealy's great 2020 talk&amp;nbsp;&lt;A href="https://conf.splunk.com/files/2020/slides/TRU1761C.pdf" target="_blank" rel="noopener"&gt;Master Joining Datasets Without Using Join&lt;/A&gt;&amp;nbsp;which I watched two years too late&lt;span class="lia-unicode-emoji" title=":face_without_mouth:"&gt;😶&lt;/span&gt;. &amp;nbsp;The first topic? "&lt;SPAN&gt;What’s Wrong With the Join and Append Commands?" (In a nutshell, whether the work is done on the indexers or on the search head makes a big difference.) &amp;nbsp;So, if anyone has the same needs in the future, favor the "&lt;FONT face="courier new,courier"&gt;OR&lt;/FONT&gt;" method over &lt;FONT face="courier new,courier"&gt;append&lt;/FONT&gt;. (But restrict the search as much as practical.) &amp;nbsp;I will certainly return to this talk the next time I have an urge to "join".&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 22 Feb 2022 01:28:23 GMT</pubDate>
    <dc:creator>yuanliu</dc:creator>
    <dc:date>2022-02-22T01:28:23Z</dc:date>
    <item>
      <title>How to Exclude some events based on a different dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/575352#M200488</link>
      <description>&lt;P&gt;The problem is a simple one: I have a base search from which I want to exclude a subset based on a criterion determined in a different dataset. &amp;nbsp;But I cannot find an efficient way to do this.&lt;/P&gt;
&lt;P&gt;So far, what I am doing is&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;basesearch
| join joinkey
  [set diff
    [ basesearch
      | stats count by joinkey
      | fields - count ]
    [ criteria
      | stats count by joinkey
      | fields - count ]
  ]&lt;/LI-CODE&gt;
&lt;P&gt;While the logic works, it feels immensely inefficient. &amp;nbsp;Even setting aside that set operations are themselves expensive, basesearch is run twice with no change.&lt;/P&gt;
&lt;P&gt;What is the proper way of doing this simple exclusion?&lt;/P&gt;</description>
      <pubDate>Tue, 22 Feb 2022 15:12:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/575352#M200488</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-02-22T15:12:30Z</dc:date>
    </item>
    <item>
      <title>Re: Exclusion of some events based on a different dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/575360#M200490</link>
      <description>&lt;P&gt;Are you looking for this type of exclusion?&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults count=10000
| eval key=random() % 4000
| eval DataSetType="Type 1"
| append [
  | makeresults count=10000
  | eval key=random() % 4000
  | eval DataSetType="Type 2"
]
| stats count values(DataSetType) as Types by key
| where mvcount(Types)=1 OR count=1&lt;/LI-CODE&gt;&lt;P&gt;If you search both data sets in your outer search and then use stats to aggregate all values by your joinkey, you can usually test for a) count=1 or b) mvcount(somefield)=1 AND somefield="your criteria".&lt;/P&gt;&lt;P&gt;Is that what you're trying to do?&lt;/P&gt;</description>
      <pubDate>Thu, 18 Nov 2021 02:36:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/575360#M200490</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2021-11-18T02:36:14Z</dc:date>
    </item>
    <item>
      <title>Re: Exclusion of some events based on a different dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/575579#M200565</link>
      <description>&lt;P&gt;Thanks,&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/6367"&gt;@bowesmana&lt;/a&gt;! &amp;nbsp;I always forget the great append. &amp;nbsp;I was experimenting with&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;(basesearch) OR (criteria)
| stats values(DataSetType) as DataSetType by joinkey
| where mvcount(DataSetType) == 1&lt;/LI-CODE&gt;&lt;P&gt;which is impossibly slower than my original logic.&lt;/P&gt;&lt;P&gt;I usually use stats to limit the number of fields as early as possible after the search. &amp;nbsp;But for this test, I also tried eventstats in order to preserve useful fields. &amp;nbsp;What I discovered is that if I do not limit the number of fields in the `criteria` search, eventstats-where makes the alternative slightly slower than join-set-diff.&lt;/P&gt;&lt;P&gt;But because, for my purposes, the `criteria` search is only used for that one key, I can limit the append to joinkey + DataSetType. &amp;nbsp;The net result is slightly faster than join-set-diff &lt;EM&gt;even with eventstats-where&lt;/EM&gt;. (My test dataset is not huge. &amp;nbsp;The advantage will be bigger with a bigger dataset.)&lt;/P&gt;</description>
      <pubDate>Fri, 19 Nov 2021 02:34:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/575579#M200565</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2021-11-19T02:34:55Z</dc:date>
    </item>
    <item>
      <title>Re: Exclusion of some events based on a different dataset</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/585977#M204128</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;SPAN&gt;... was experimenting with&amp;nbsp;&lt;/SPAN&gt;&lt;BLOCKQUOTE&gt;&lt;FONT face="courier new,courier"&gt;(basesearch) OR (criteria)&lt;BR /&gt;| stats values(DataSetType) as DataSetType by joinkey&lt;BR /&gt;| where mvcount(DataSetType) == 1&lt;/FONT&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;which is impossibly slower than my original logic.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;HR /&gt;&lt;P&gt;Since this topic is really about avoiding &lt;FONT face="courier new,courier"&gt;join&lt;/FONT&gt; to improve performance, it is worth an update: I have since tested "OR" in the raw search more and found that the above conclusion was false; I had been testing the two methods under different conditions. &amp;nbsp;Using "OR" is in fact the more advantageous method.&lt;/P&gt;&lt;P&gt;What motivated this update, though, was Nick Mealy's great 2020 talk&amp;nbsp;&lt;A href="https://conf.splunk.com/files/2020/slides/TRU1761C.pdf" target="_blank" rel="noopener"&gt;Master Joining Datasets Without Using Join&lt;/A&gt;&amp;nbsp;which I watched two years too late&lt;span class="lia-unicode-emoji" title=":face_without_mouth:"&gt;😶&lt;/span&gt;. &amp;nbsp;The first topic? "&lt;SPAN&gt;What’s Wrong With the Join and Append Commands?" (In a nutshell, whether the work is done on the indexers or on the search head makes a big difference.) &amp;nbsp;So, if anyone has the same needs in the future, favor the "&lt;FONT face="courier new,courier"&gt;OR&lt;/FONT&gt;" method over &lt;FONT face="courier new,courier"&gt;append&lt;/FONT&gt;. (But restrict the search as much as practical.) &amp;nbsp;I will certainly return to this talk the next time I have an urge to "join".&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Feb 2022 01:28:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-Exclude-some-events-based-on-a-different-dataset/m-p/585977#M204128</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-02-22T01:28:23Z</dc:date>
    </item>
  </channel>
</rss>

