<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Understanding Math on the Search Line in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Understanding-Math-on-the-Search-Line/m-p/118794#M31784</link>
    <description>&lt;P&gt;I'm having trouble understanding the math rules on the search line. Instead of continuing to guess what might work and having Splunk tell me my formatting is wrong, or give me nonsensical results, I'm just going to ask for help.&lt;/P&gt;

&lt;P&gt;I have a lot of data that is aggregated into 1-minute bins. For each 1-minute bin I have:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;x_i: mean value of my variable as measured in the 1 minute bin&lt;/LI&gt;
&lt;LI&gt;n_i: number of times I measured it in the 1 minute bin&lt;/LI&gt;
&lt;LI&gt;sigma_i: standard deviation of the variable measured in the 1 minute bin&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;I'd then like to move to 1-hour bins, so I want to calculate a weighted average of my 1-minute bins. Mathematically this is very straightforward:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;x_weighted_average = sum(weight_i * x_i) / sum(weight_i)&lt;/LI&gt;
&lt;LI&gt;where traditionally&lt;/LI&gt;
&lt;LI&gt;weight_i = 1 / error_i^2&lt;/LI&gt;
&lt;LI&gt;and for counting statistics&lt;/LI&gt;
&lt;LI&gt;error_i = sigma_i / sqrt(n_i)&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;Putting that together, we get:&lt;/P&gt;

&lt;P&gt;x_weighted_average = sum(x_i * n_i / sigma_i^2) / sum(n_i / sigma_i^2)&lt;/P&gt;

&lt;P&gt;What is the best way to do this kind of math on the search line while aggregating? Or should I just pass the data to a Python script and read the results back?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Tristan&lt;/P&gt;</description>
    <pubDate>Mon, 28 Oct 2013 23:12:12 GMT</pubDate>
    <dc:creator>tristanmatthews</dc:creator>
    <dc:date>2013-10-28T23:12:12Z</dc:date>
    <item>
      <title>Understanding Math on the Search Line</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Understanding-Math-on-the-Search-Line/m-p/118794#M31784</link>
      <description>&lt;P&gt;I'm having trouble understanding the math rules on the search line. Instead of continuing to guess what might work and having Splunk tell me my formatting is wrong, or give me nonsensical results, I'm just going to ask for help.&lt;/P&gt;

&lt;P&gt;I have a lot of data that is aggregated into 1-minute bins. For each 1-minute bin I have:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;x_i: mean value of my variable as measured in the 1 minute bin&lt;/LI&gt;
&lt;LI&gt;n_i: number of times I measured it in the 1 minute bin&lt;/LI&gt;
&lt;LI&gt;sigma_i: standard deviation of the variable measured in the 1 minute bin&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;I'd then like to move to 1-hour bins, so I want to calculate a weighted average of my 1-minute bins. Mathematically this is very straightforward:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;x_weighted_average = sum(weight_i * x_i) / sum(weight_i)&lt;/LI&gt;
&lt;LI&gt;where traditionally&lt;/LI&gt;
&lt;LI&gt;weight_i = 1 / error_i^2&lt;/LI&gt;
&lt;LI&gt;and for counting statistics&lt;/LI&gt;
&lt;LI&gt;error_i = sigma_i / sqrt(n_i)&lt;/LI&gt;
&lt;/UL&gt;
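
&lt;P&gt;Spelling out the substitution of error_i into weight_i (a sketch in LaTeX notation, where w_i and \sigma_i are shorthand introduced here for weight_i and sigma_i):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;\[
w_i = \frac{1}{\mathrm{error}_i^{2}}
    = \frac{1}{\left(\sigma_i / \sqrt{n_i}\right)^{2}}
    = \frac{n_i}{\sigma_i^{2}}
\]
&lt;/CODE&gt;&lt;/PRE&gt;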

&lt;P&gt;Putting that together, we get:&lt;/P&gt;

&lt;P&gt;x_weighted_average = sum(x_i * n_i / sigma_i^2) / sum(n_i / sigma_i^2)&lt;/P&gt;

&lt;P&gt;What is the best way to do this kind of math on the search line while aggregating? Or should I just pass the data to a Python script and read the results back?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Tristan&lt;/P&gt;</description>
      <pubDate>Mon, 28 Oct 2013 23:12:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Understanding-Math-on-the-Search-Line/m-p/118794#M31784</guid>
      <dc:creator>tristanmatthews</dc:creator>
      <dc:date>2013-10-28T23:12:12Z</dc:date>
    </item>
    <item>
      <title>Re: Understanding Math on the Search Line</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Understanding-Math-on-the-Search-Line/m-p/118795#M31785</link>
      <description>&lt;P&gt;I'd still do this in the search language, but it's up to you. You break the calculation down into individual &lt;CODE&gt;eval&lt;/CODE&gt;, &lt;CODE&gt;stats&lt;/CODE&gt; and &lt;CODE&gt;bin&lt;/CODE&gt; statements. Some more complex calculations also need &lt;CODE&gt;streamstats&lt;/CODE&gt; and/or &lt;CODE&gt;eventstats&lt;/CODE&gt;, or other more advanced commands, but here it's just a lot of brute-force &lt;CODE&gt;eval&lt;/CODE&gt; and &lt;CODE&gt;stats&lt;/CODE&gt;.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;foo | bin _time span=1m
| stats avg(var1) as x_i count as n_i stdev(var1) as sigma_i by _time
| eval error_i=sigma_i/sqrt(n_i)
| eval weight_i=1/(error_i*error_i)
| eval weight_times_mean=weight_i*x_i
| bin _time span=1h
| stats sum(weight_times_mean) as sum_wtm sum(weight_i) as sum_weight by _time
| eval x_weighted_average=sum_wtm/sum_weight
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Oct 2013 00:06:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Understanding-Math-on-the-Search-Line/m-p/118795#M31785</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2013-10-29T00:06:17Z</dc:date>
    </item>
  </channel>
</rss>

