<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to represent normal distribution in a graph format using mean and stdev values? in All Apps and Add-ons</title>
    <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372195#M45011</link>
    <description>&lt;P&gt;In my case, the whole idea is that I want to find the type of distribution of my data. Based on the distribution type, I want to find whether the data has any outliers; if not, I want to find the mean and stdev. First, I am not sure how to represent the distribution type graphically using Splunk Enterprise, without the Machine Learning Toolkit. Can anyone help with this?&lt;/P&gt;

&lt;P&gt;NOTE: Correct me if my concept is wrong, because I am new to Data Science/Machine Learning concepts.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Chandana&lt;/P&gt;</description>
    <pubDate>Wed, 02 May 2018 16:38:45 GMT</pubDate>
    <dc:creator>chandana204</dc:creator>
    <dc:date>2018-05-02T16:38:45Z</dc:date>
    <item>
      <title>How to represent normal distribution in a graph format using mean and stdev values?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372192#M45008</link>
      <description>&lt;P&gt;I want to represent a normal distribution as a graph using mean and stdev values. Is this possible in Splunk Enterprise?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Chandana &lt;/P&gt;</description>
      <pubDate>Mon, 30 Apr 2018 21:53:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372192#M45008</guid>
      <dc:creator>chandana204</dc:creator>
      <dc:date>2018-04-30T21:53:54Z</dc:date>
    </item>
    <item>
      <title>Re: How to represent normal distribution in a graph format using mean and stdev values?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372193#M45009</link>
      <description>&lt;P&gt;Hi @chandana204 - there are a few ways you could do this, but depending on what you're trying to do, some might make more sense than others. What is the purpose of graphing the mean and stdev? Are you looking for a z-test of sorts, for example? It may be easier to calculate some of the statistics without a graph of the full distribution.&lt;/P&gt;

&lt;P&gt;Could you tell us more about what you're trying to do?&lt;/P&gt;</description>
      <pubDate>Mon, 30 Apr 2018 22:22:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372193#M45009</guid>
      <dc:creator>aljohnson_splun</dc:creator>
      <dc:date>2018-04-30T22:22:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to represent normal distribution in a graph format using mean and stdev values?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372194#M45010</link>
      <description>&lt;P&gt;Here's a pure SPL version. It's kinda terrible and fun.&lt;/P&gt;

&lt;P&gt;First, let's generate ten thousand values from a &lt;A href="https://en.wikipedia.org/wiki/Normal_distribution#Standard_normal_distribution"&gt;standard normal distribution&lt;/A&gt; using the &lt;A href="https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform"&gt;Box-Muller transform&lt;/A&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000
| eval standard_normal = sqrt(-2 * ln((random() / (pow(2, 31) -1)))) * cos((2*pi()*(random() / (pow(2, 31) -1))))
| fields standard_normal
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This gives us 10,000 samples from a standard normal distribution: &lt;BR /&gt;
&lt;IMG src="https://i.imgur.com/nPOEkVs.png" alt="alt text" /&gt;&lt;/P&gt;
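
&lt;P&gt;To turn samples like these into an explicit histogram, one option is to bucket the values by hand and count them. This is only a rough sketch: the bucket width of 0.25 is an arbitrary choice, and &lt;CODE&gt;u1&lt;/CODE&gt;, &lt;CODE&gt;u2&lt;/CODE&gt;, and &lt;CODE&gt;bucket_start&lt;/CODE&gt; are made-up field names. It also adds one to the first uniform draw so that &lt;CODE&gt;ln()&lt;/CODE&gt; never sees zero.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=10000
| eval u1 = (random() + 1) / pow(2, 31), u2 = random() / pow(2, 31)
| eval standard_normal = sqrt(-2 * ln(u1)) * cos(2 * pi() * u2)
| eval bucket_start = floor(standard_normal / 0.25) * 0.25
| stats count by bucket_start
| sort bucket_start
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Rendered as a column chart with &lt;CODE&gt;bucket_start&lt;/CODE&gt; on the x-axis, this should come out roughly bell-shaped.&lt;/P&gt;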

&lt;HR /&gt;

&lt;P&gt;Since we might want to make multiple of these, we can put it into a macro (read more about making those &lt;A href="http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/Knowledge/Definesearchmacros"&gt;here&lt;/A&gt;), which I'll just call &lt;CODE&gt;makenormal(3)&lt;/CODE&gt; - it is parameterized by the number of samples, the mean, and standard deviation. We just scale by the standard deviation, and add the mean to unstandardize our samples. &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;appendpipe [
| makeresults count=$count$
| eval normal_mu$mean$_sigma$stdev$ = (sqrt(-2 * ln((random() / (pow(2, 31) -1)))) * cos((2*pi()*(random() / (pow(2, 31) -1)))) * $stdev$) + $mean$
| fields - _time
]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Now that we have our macro, we can use it to compare two distributions... however, getting them to share the same binned x values (as you would expect on a standard graph) requires a little SPL trickery.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| stats count
| `makenormal(10000, 10, 2)`
| `makenormal(100000, 2, 10)`
| foreach normal* [bin &amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt; span=0.5]
| untable count field value
| eventstats count by value field
| xyseries value field count
| sort value
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;which gives us a visualization like this:&lt;/P&gt;

&lt;P&gt;&lt;IMG src="https://i.imgur.com/Qo16oMG.png" alt="alt text" /&gt;&lt;/P&gt;
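
&lt;P&gt;If the goal is only to draw the curve for a known mean and stdev, sampling is not strictly needed: the normal density can be evaluated on a grid of x values. A minimal sketch, assuming mean 10 and stdev 2 as in the first macro call above, a grid of 121 points covering four standard deviations on either side, and made-up field names &lt;CODE&gt;i&lt;/CODE&gt;, &lt;CODE&gt;x&lt;/CODE&gt;, and &lt;CODE&gt;density&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults count=121
| streamstats count as i
| eval mean = 10, stdev = 2
| eval x = mean + ((i - 61) / 15) * stdev
| eval density = exp(-0.5 * pow((x - mean) / stdev, 2)) / (stdev * sqrt(2 * pi()))
| table x density
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Shown as a line chart with &lt;CODE&gt;x&lt;/CODE&gt; on the x-axis, this traces the theoretical bell curve rather than a sampled one.&lt;/P&gt;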

&lt;HR /&gt;

&lt;P&gt;To be frank, I really doubt anyone would want to do this in Splunk in this exact fashion... but if you expand your question with more information about what kind of problems you want to solve, or how you intend to use these values from a normal distribution, I'm sure there are lots of better ways. For example, we could use custom search commands, a custom visualization, or even a custom algorithm in the Machine Learning Toolkit to achieve the same thing in a much simpler fashion.&lt;/P&gt;

&lt;P&gt;But I just wanted to show it could be done in SPL, too &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 01 May 2018 02:54:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372194#M45010</guid>
      <dc:creator>aljohnson_splun</dc:creator>
      <dc:date>2018-05-01T02:54:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to represent normal distribution in a graph format using mean and stdev values?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372195#M45011</link>
      <description>&lt;P&gt;In my case, the whole idea is that I want to find the type of distribution of my data. Based on the distribution type, I want to find whether the data has any outliers; if not, I want to find the mean and stdev. First, I am not sure how to represent the distribution type graphically using Splunk Enterprise, without the Machine Learning Toolkit. Can anyone help with this?&lt;/P&gt;

&lt;P&gt;NOTE: Correct me if my concept is wrong, because I am new to Data Science/Machine Learning concepts.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Chandana&lt;/P&gt;</description>
      <pubDate>Wed, 02 May 2018 16:38:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372195#M45011</guid>
      <dc:creator>chandana204</dc:creator>
      <dc:date>2018-05-02T16:38:45Z</dc:date>
    </item>
    <item>
      <title>Re: How to represent normal distribution in a graph format using mean and stdev values?</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372196#M45012</link>
      <description>&lt;P&gt;Hey @chandana204 - there are a few commands that use similar approaches to look for outliers in Splunk. Try checking out the &lt;CODE&gt;anomalydetection&lt;/CODE&gt; command. &lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Anomalydetection"&gt;http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Anomalydetection&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;If you just want to find the mean and stdev, you can check out the &lt;CODE&gt;stats&lt;/CODE&gt; command, e.g.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| stats mean(field) as mean, stdev(field) as stdev
&lt;/CODE&gt;&lt;/PRE&gt;
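
&lt;P&gt;Building on that, a common rule of thumb (an assumption here, not something built into Splunk) is to flag values more than three standard deviations from the mean. A sketch using &lt;CODE&gt;eventstats&lt;/CODE&gt; so that every event keeps its own value; &lt;CODE&gt;field&lt;/CODE&gt; is the same placeholder as above, and &lt;CODE&gt;field_mean&lt;/CODE&gt;, &lt;CODE&gt;field_stdev&lt;/CODE&gt;, &lt;CODE&gt;z&lt;/CODE&gt;, and &lt;CODE&gt;is_outlier&lt;/CODE&gt; are made-up names:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| eventstats avg(field) as field_mean, stdev(field) as field_stdev
| eval z = (field - field_mean) / field_stdev
| eval is_outlier = if(abs(z) &amp;gt; 3, 1, 0)
| where is_outlier = 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The threshold of 3 only makes sense if the data is roughly normal, so treat the result as a starting point.&lt;/P&gt;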

&lt;P&gt;If you are just looking for numeric outliers, you can try using the "Detect Numeric Outliers" assistant in the MLTK.&lt;/P&gt;

&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/MLApp/3.2.0/User/DetectNumericOutliers"&gt;https://docs.splunk.com/Documentation/MLApp/3.2.0/User/DetectNumericOutliers&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;As for fitting your values directly to a distribution and checking how well it fits, you could hypothetically use a custom algorithm in the Machine Learning Toolkit to run a &lt;A href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test"&gt;Kolmogorov-Smirnov test&lt;/A&gt; or an &lt;A href="https://en.wikipedia.org/wiki/Anderson%E2%80%93Darling_test"&gt;Anderson-Darling test&lt;/A&gt; to see if your samples match some distribution. You can read about custom algorithms &lt;A href="http://docs.splunk.com/Documentation/MLApp/3.2.0/API/Introduction"&gt;here&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 03 May 2018 23:31:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/How-to-represent-normal-distribution-in-a-graph-format-using/m-p/372196#M45012</guid>
      <dc:creator>aljohnson_splun</dc:creator>
      <dc:date>2018-05-03T23:31:50Z</dc:date>
    </item>
  </channel>
</rss>

