<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to create alert when a value is an outlier in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570402#M101089</link>
    <description>&lt;P&gt;If your search executions are distinct runs of the search where you only have access to the current run's data, then you will need to do one of the following:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Change that search so that it calculates 2 (or more) ranges of the result set, e.g. execution 2 calculates the results for executions 1 and 2, so that you have the data to compare changes. OR&lt;/LI&gt;&lt;LI&gt;Change the existing search so that, having calculated the values, it looks up the table name in a new lookup file that contains table_name, partitions, rows (assuming you have multiple tables). You can then calculate the variance and save the latest results back to the lookup table for the next iteration.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;That would raise some issues you'd have to deal with, e.g. in your execution 4, when it goes back to 'normal', if you have just saved 10000 then there will again be a variance, but that can be dealt with.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sun, 10 Oct 2021 21:54:24 GMT</pubDate>
    <dc:creator>bowesmana</dc:creator>
    <dc:date>2021-10-10T21:54:24Z</dc:date>
    <item>
      <title>How to create alert when a value is an outlier</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570122#M101055</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I'm currently trying to use Splunk to create an alert for the following scenario:&lt;/P&gt;&lt;P&gt;I have a search that tells me the number of rows and partitions a data pipeline ingested, so I already extract the following fields:&lt;BR /&gt;- Table Name&lt;BR /&gt;- Number of partitions&lt;BR /&gt;- Number of rows&lt;/P&gt;&lt;P&gt;I also have a dashboard that shows a timechart of the number of partitions and rows across executions over time.&lt;/P&gt;&lt;P&gt;What I need is an alert that is triggered when the number of partitions or rows differs by more than a specified percentage between executions. In the example below, executions 1 and 2 differ only slightly, but execution 3 is clearly an outlier and should trigger an alert.&lt;/P&gt;&lt;P&gt;Execution 1:&lt;BR /&gt;table_name = table_1&lt;BR /&gt;num_part = 12&lt;BR /&gt;num_rows = 1400&lt;BR /&gt;&lt;BR /&gt;Execution 2:&lt;BR /&gt;table_name = table_1&lt;BR /&gt;&lt;SPAN&gt;num_part = 10&lt;BR /&gt;num_rows = 1000&lt;BR /&gt;&lt;BR /&gt;Execution 3:&lt;BR /&gt;table_name = table_1&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;num_part = 10000&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;num_rows = 100000000&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any suggestions on how I can do this?&lt;/P&gt;</description>
      <pubDate>Thu, 07 Oct 2021 19:09:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570122#M101055</guid>
      <dc:creator>nochimows</dc:creator>
      <dc:date>2021-10-07T19:09:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to create alert when a value is an outlier</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570269#M101077</link>
      <description>&lt;P&gt;It's possible.&lt;/P&gt;&lt;P&gt;Unless you have Splunk MLTK you might have to do some statistics&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;Take a look at this article, which talks about how to identify outliers and how to use the IQR to do so.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/8.2.2/Search/Findingandremovingoutliers" target="_self"&gt;https://docs.splunk.com/Documentation/Splunk/8.2.2/Search/Findingandremovingoutliers&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Oct 2021 17:31:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570269#M101077</guid>
      <dc:creator>Stefanie</dc:creator>
      <dc:date>2021-10-08T17:31:08Z</dc:date>
    </item>
    <item>
      <title>Re: How to create alert when a value is an outlier</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570402#M101089</link>
      <description>&lt;P&gt;If your search executions are distinct runs of the search where you only have access to the current run's data, then you will need to do one of the following:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Change that search so that it calculates 2 (or more) ranges of the result set, e.g. execution 2 calculates the results for executions 1 and 2, so that you have the data to compare changes. OR&lt;/LI&gt;&lt;LI&gt;Change the existing search so that, having calculated the values, it looks up the table name in a new lookup file that contains table_name, partitions, rows (assuming you have multiple tables). You can then calculate the variance and save the latest results back to the lookup table for the next iteration.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;That would raise some issues you'd have to deal with, e.g. in your execution 4, when it goes back to 'normal', if you have just saved 10000 then there will again be a variance, but that can be dealt with.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 10 Oct 2021 21:54:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570402#M101089</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2021-10-10T21:54:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to create alert when a value is an outlier</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570873#M101153</link>
      <description>&lt;P&gt;Hi all, thank you for the tips.&lt;/P&gt;&lt;P&gt;I'm trying the following approach: I created a dataset with the global statistics of each table.&amp;nbsp;&lt;BR /&gt;Now I'm trying to join the results of my search with the results of my dataset where the column "Table" is the same, so that I can create a column "IsOutlier" using an if statement that reads from my dataset.&lt;/P&gt;&lt;P&gt;I wanted to do something like this:&lt;/P&gt;&lt;P&gt;... my search that returns table_name and number_rows | eval isOutlier=if(number_rows &amp;lt; mydataset.lowerBound where mydataset.table = "table_name" OR number_rows &amp;gt; mydataset.upperBound where mydataset.table = table_name, 1, 0)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What's the right way to write such a statement in a Splunk search?&lt;/P&gt;</description>
      <pubDate>Wed, 13 Oct 2021 23:05:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-create-alert-when-a-value-is-an-outlier/m-p/570873#M101153</guid>
      <dc:creator>nochimows</dc:creator>
      <dc:date>2021-10-13T23:05:00Z</dc:date>
    </item>
  </channel>
</rss>

