<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Parallel stats - most efficient structure in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368083#M163084</link>
    <description>&lt;P&gt;You could do something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;|multireport
[|bucket _time span=1min| stats DC(id) as unique_devices_minute BY device_type _time|eval period="minute"]
[|bucket _time span=1h| stats DC(id) as unique_devices_hour BY device_type _time|eval period="hour"]
[|bucket _time span=1d| stats DC(id) as unique_devices_day BY device_type _time|eval period="day"]
| eval unique_devices=CASE(period="minute",unique_devices_minute,period="hour",unique_devices_hour,period="day",unique_devices_day)
| fields - unique_devices_minute, unique_devices_hour, unique_devices_day
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I see that you're doing some sort of summing by minute/hour/day before you eval &lt;CODE&gt;unique_devices&lt;/CODE&gt; in separate appendpipes, but I'm not entirely sure what that's doing in the end. Some sample data and desired output might be more helpful. &lt;/P&gt;</description>
    <pubDate>Wed, 07 Feb 2018 13:11:39 GMT</pubDate>
    <dc:creator>cmerriman</dc:creator>
    <dc:date>2018-02-07T13:11:39Z</dc:date>
    <item>
      <title>Parallel stats - most efficient structure</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368081#M163082</link>
      <description>&lt;P&gt;I frequently have to create stats reports where some parts are, essentially, executable in parallel with others.  An example would be:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;search &amp;lt;something&amp;gt; 
  | appendpipe [ eval _time=FLOOR(_time/60)*60 | stats DC(id) AS unique_devices_minute BY device_type, _time ]
  | appendpipe [ eval _time=FLOOR(_time/3600)*3600 | stats DC(id) AS unique_devices_hour BY device_type, _time ]
  | appendpipe [ eval _time=FLOOR(_time/86400)*86400 | stats DC(id) AS unique_devices_day BY device_type, _time ]
  | eval _time=FLOOR(_time/60)*60
  | stats &amp;lt;&amp;lt;some summaries&amp;gt;&amp;gt; SUM(unique_devices_minute) AS unique_devices_minute, SUM(unique_devices_hour) AS unique_devices_hour, SUM(unique_devices_day) AS unique_devices_day BY device_type, _time
  | eval period="minute"
  | appendpipe [ eval _time=FLOOR(_time/3600)*3600 
    | stats &amp;lt;&amp;lt;sum per-minute to per-hour&amp;gt;&amp;gt; SUM(unique_devices_hour) AS unique_devices_hour, SUM(unique_devices_day) AS unique_devices_day BY device_type, _time
    | eval period="hour" ]
  | appendpipe [ where period="hour" | eval _time=FLOOR(_time/86400)*86400 
    | stats &amp;lt;&amp;lt;sum per-hour to per-day&amp;gt;&amp;gt;  SUM(unique_devices_day) AS unique_devices_day BY device_type, _time
    | eval period="day" ]
  | eval unique_devices=CASE(period="minute",unique_devices_minute,period="hour",unique_devices_hour,eval period="day",unique_devices_day ) | fields - unique_devices_minute, unique_devices_hour, unique_devices_day 
&lt;/CODE&gt;&lt;/PRE&gt;
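
&lt;P&gt;For reference, the FLOOR arithmetic above is plain integer truncation: each epoch timestamp is snapped down to the start of its minute, hour, or day bucket. A quick sketch of the same arithmetic in Python (illustrative only, not SPL):&lt;/P&gt;

```python
# Epoch-time bucketing, equivalent to SPL's FLOOR(_time/span)*span.
def bucket(epoch: int, span: int) -> int:
    """Floor an epoch timestamp to the start of its span-second bucket."""
    return (epoch // span) * span

t = 1518002306  # Wed, 07 Feb 2018 11:18:26 GMT
minute_start = bucket(t, 60)     # start of the containing minute
hour_start = bucket(t, 3600)     # start of the containing hour
day_start = bucket(t, 86400)     # start of the containing day (UTC)
```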

&lt;P&gt;This gives the results I want in a single report, but is it the most efficient way to structure this?&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 11:18:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368081#M163082</guid>
      <dc:creator>JeToJedno</dc:creator>
      <dc:date>2018-02-07T11:18:26Z</dc:date>
    </item>
    <item>
      <title>Re: Parallel stats - most efficient structure</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368082#M163083</link>
      <description>&lt;P&gt;Before anyone says anything, the first appendpipe (line 2) is unnecessary and can be folded into the stats command on line 6.&lt;/P&gt;

&lt;P&gt;... and there's a spurious "eval" on line 14. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 11:20:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368082#M163083</guid>
      <dc:creator>JeToJedno</dc:creator>
      <dc:date>2018-02-07T11:20:09Z</dc:date>
    </item>
    <item>
      <title>Re: Parallel stats - most efficient structure</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368083#M163084</link>
      <description>&lt;P&gt;You could do something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;|multireport
[|bucket _time span=1min| stats DC(id) as unique_devices_minute BY device_type _time|eval period="minute"]
[|bucket _time span=1h| stats DC(id) as unique_devices_hour BY device_type _time|eval period="hour"]
[|bucket _time span=1d| stats DC(id) as unique_devices_day BY device_type _time|eval period="day"]
| eval unique_devices=CASE(period="minute",unique_devices_minute,period="hour",unique_devices_hour,period="day",unique_devices_day)
| fields - unique_devices_minute, unique_devices_hour, unique_devices_day
&lt;/CODE&gt;&lt;/PRE&gt;
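
&lt;P&gt;In case it helps to see the shape of the output: each bracketed sub-pipeline gets its own copy of the input events, computes a distinct count per device_type and time bucket at one granularity, and tags its rows with a period, and the results are then unioned. A rough Python sketch of that shape, using made-up sample events (illustrative only, not SPL):&lt;/P&gt;

```python
from collections import defaultdict

def distinct_per_bucket(events, span):
    """events: (epoch, device_type, id) tuples. Returns a dict mapping
    (device_type, bucket_start) to the number of distinct ids in that
    bucket, where bucket_start = (epoch // span) * span."""
    seen = defaultdict(set)
    for t, dtype, dev_id in events:
        seen[(dtype, (t // span) * span)].add(dev_id)
    return {key: len(ids) for key, ids in seen.items()}

# Made-up sample events.
events = [(0, "cam", "a"), (30, "cam", "b"), (70, "cam", "a"),
          (3700, "cam", "c"), (30, "dvr", "a")]

# One "sub-report" per granularity, unioned into period-tagged rows.
rows = []
for period, span in (("minute", 60), ("hour", 3600), ("day", 86400)):
    for (dtype, start), n in distinct_per_bucket(events, span).items():
        rows.append({"device_type": dtype, "_time": start,
                     "period": period, "unique_devices": n})
```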

&lt;P&gt;I see that you're doing some sort of summing by minute/hour/day before you eval &lt;CODE&gt;unique_devices&lt;/CODE&gt; in separate appendpipes, but I'm not entirely sure what that's doing in the end. Some sample data and desired output might be more helpful. &lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 13:11:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368083#M163084</guid>
      <dc:creator>cmerriman</dc:creator>
      <dc:date>2018-02-07T13:11:39Z</dc:date>
    </item>
    <item>
      <title>Re: Parallel stats - most efficient structure</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368084#M163085</link>
      <description>&lt;P&gt;The other stats calculated are always the average, peak &amp;amp; maximum request rates, and sometimes a first-seen-in-day unique-device count.  Peak rate is the max of a 2-second moving-average request rate, or a close approximation (e.g. the 98th percentile of the per-second request rate within each minute, then the max of the per-minute 98th percentiles for the hour &amp;amp; day).&lt;/P&gt;
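
&lt;P&gt;That percentile approximation of the peak is easy to state outside SPL: take the 98th percentile of per-second request counts within each minute, then the max of those per-minute values. A hypothetical Python sketch using a nearest-rank percentile (note that seconds with zero requests are simply absent, which biases the percentile upward slightly):&lt;/P&gt;

```python
from collections import Counter

def p98(values):
    """Nearest-rank 98th percentile of a non-empty list."""
    ranked = sorted(values)
    rank = -(-98 * len(ranked) // 100)  # ceil(0.98 * n)
    return ranked[rank - 1]

def approx_peak_rate(event_times):
    """Approximate peak requests/sec over the given window: 98th percentile
    of per-second counts within each minute, then the max over minutes."""
    per_second = Counter(int(t) for t in event_times)
    by_minute = {}
    for sec, n in per_second.items():
        by_minute.setdefault(sec // 60, []).append(n)
    return max(p98(counts) for counts in by_minute.values())
```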

&lt;P&gt;I was hoping to avoid making multiple independent passes through the whole dataset, but I can see now that's probably not going to be possible, as a distinct count over a different time period will always require a separate pass through the base data. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Feb 2018 13:36:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Parallel-stats-most-efficient-structure/m-p/368084#M163085</guid>
      <dc:creator>JeToJedno</dc:creator>
      <dc:date>2018-02-07T13:36:48Z</dc:date>
    </item>
  </channel>
</rss>

