<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Cumulative sum of a distinct count over time. in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60349#M180131</link>
    <description>&lt;P&gt;I think I follow the logic here, will have to experiment.  I ran into a 10000 row limit on this one though.  I did find a "hack" that &lt;EM&gt;seems&lt;/EM&gt; to do it:&lt;/P&gt;

&lt;P&gt;...| bin _time span=6h as hour | stats values(flowers) as cflowers dc(flowers) as "Flowers per hour" by hour | streamstats dc(cflowers) as "Cumulative total" | eval Time=strftime(hour, "%m-%d.%H") | table Time, "Flowers per hour", "Cumulative total"&lt;/P&gt;</description>
    <pubDate>Sun, 29 Jan 2012 19:26:43 GMT</pubDate>
    <dc:creator>howyagoin</dc:creator>
    <dc:date>2012-01-29T19:26:43Z</dc:date>
    <item>
      <title>Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60340#M180122</link>
      <description>&lt;P&gt;I've got a variable, call it "flowers," related to orders from a shop.  I'm trying to get a chart of the number of unique flower types purchased over a time range, as well as the cumulative total.&lt;/P&gt;

&lt;P&gt;I've found another answer here which gave the hint of:&lt;/P&gt;

&lt;P&gt;searchterm  | stats values(flowers) as fcount dc(flowers) by date_minute | streamstats dc(fcount) as "Cumulative total"&lt;/P&gt;

&lt;P&gt;And whilst that works, it breaks down the number of unique flowers per a given minute, across the date range -- all fine if that range is within a given hour, but, otherwise it's not quite the right way to break this down.&lt;/P&gt;

&lt;P&gt;What I'm after though is a way to look at 36 hours worth of flower sales, and for each hour, or minute, see the number of unique flower types sold, but also have the cumulative total.&lt;/P&gt;

&lt;P&gt;As I interpret the above example, it's recycling "minutes" (or hours if I use date_hour) so that the 12th minute across two days is only represented by a single value..&lt;/P&gt;

&lt;P&gt;Missing something obvious here.....&lt;/P&gt;</description>
      <pubDate>Sat, 28 Jan 2012 07:01:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60340#M180122</guid>
      <dc:creator>howyagoin</dc:creator>
      <dc:date>2012-01-28T07:01:12Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60341#M180123</link>
      <description>&lt;P&gt;I think you may want this :&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... earliest=-36h | timechart span=1h dc(flower) AS "# of flower types purchased" | streamstats sum("# of flower types purchased") AS total
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm not sure how relevant the cumulative total will be though, as it is only representative of the sum of distinct counts over time, which obviously will have duplicate flower types.&lt;/P&gt;

&lt;P&gt;You might find the results of this search more interesting as they will expose similar information :&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... earliest=-36h | timechart span=1h count by flower | addtotals | appendcols [search ... | timechart span=1h dc(flower) AS "# of flower types purchased"]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This will provide the number of purchases per flower type, the total number of purchases, and the distinct count of flower types purchased for every hour.&lt;/P&gt;

&lt;P&gt;For more detais and examples, check the Search Reference manual records for &lt;A href="http://docs.splunk.com/Documentation/Splunk/4.2.5/SearchReference/Timechart"&gt;timechart&lt;/A&gt;, &lt;A href="http://docs.splunk.com/Documentation/Splunk/4.2.5/SearchReference/Appendcols"&gt;appendcols&lt;/A&gt; and &lt;A href="http://docs.splunk.com/Documentation/Splunk/4.2.5/SearchReference/Addtotals"&gt;addtotals&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 28 Jan 2012 16:59:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60341#M180123</guid>
      <dc:creator>hexx</dc:creator>
      <dc:date>2012-01-28T16:59:11Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60342#M180124</link>
      <description>&lt;P&gt;Both of these are interesting, thanks, gives me something to play with.  What I was hoping to accomplish though was to have a graph of time on the X axis, number of flowers on the Y, with one line representing the number of unique flowers per that increment of time (hour/minute, whatever) -- but a second line representing the cumulative total over all time, rather than just for that unit of time.&lt;/P&gt;

&lt;P&gt;So the second line would never, in theory, go down; if there were 2 unique flowers at noon on Monday, and 4 at 1300, and a total of 5 unique between those times, the second line should be at "5" on Y&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 02:23:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60342#M180124</guid>
      <dc:creator>howyagoin</dc:creator>
      <dc:date>2012-01-29T02:23:51Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60343#M180125</link>
      <description>&lt;P&gt;I see. How about :&lt;BR /&gt;&lt;BR /&gt;
&lt;CODE&gt;... earliest=-36h | timechart span=1h count, dc(flower) AS "Distinct count of flower types purchased" | streamstats sum(count) AS "Total flowers purchased" | fields - count&lt;/CODE&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 02:36:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60343#M180125</guid>
      <dc:creator>hexx</dc:creator>
      <dc:date>2012-01-29T02:36:03Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60344#M180126</link>
      <description>&lt;P&gt;This is a bit tricky, but possible. See my answer.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 17:35:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60344#M180126</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2012-01-29T17:35:41Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60345#M180127</link>
      <description>&lt;P&gt;hmm. maybe not. &lt;CODE&gt;streamstats&lt;/CODE&gt; doesn't work with &lt;CODE&gt;sistats&lt;/CODE&gt; as I had hoped.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 17:40:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60345#M180127</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2012-01-29T17:40:56Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60346#M180128</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;...
| sort _time 
| streamstats global=f window=0 current=t 
  values(flowers) as v_flowers
| eval v_flowers=mvjoin(v_flowers,";")
| timechart span=1h dc(flowers) as cur_dc, last(v_flowers) as cum_dc
| eval cum_dc=mvcount(split(cum_dc,";"))
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The main tricks are (a) you need to sort and get the cumulative count first, and (b) convert the list of items from a multivalue field since it seems that the &lt;CODE&gt;timechart&lt;/CODE&gt;'s &lt;CODE&gt;last()&lt;/CODE&gt; function doesn't preserve multivalues. &lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 18:18:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60346#M180128</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2012-01-29T18:18:43Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60347#M180129</link>
      <description>&lt;P&gt;Try what I came up and posted now.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 18:19:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60347#M180129</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2012-01-29T18:19:07Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60348#M180130</link>
      <description>&lt;P&gt;@howyagoin : Oh ok, you want a &lt;STRONG&gt;distinct count over the entire period&lt;/STRONG&gt;, not an on-going cumulative sum of the item sales for each cycle.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 19:16:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60348#M180130</guid>
      <dc:creator>hexx</dc:creator>
      <dc:date>2012-01-29T19:16:26Z</dc:date>
    </item>
    <item>
      <title>Re: Cumulative sum of a distinct count over time.</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60349#M180131</link>
      <description>&lt;P&gt;I think I follow the logic here, will have to experiment.  I ran into a 10000 row limit on this one though.  I did find a "hack" that &lt;EM&gt;seems&lt;/EM&gt; to do it:&lt;/P&gt;

&lt;P&gt;...| bin _time span=6h as hour | stats values(flowers) as cflowers dc(flowers) as "Flowers per hour" by hour | streamstats dc(cflowers) as "Cumulative total" | eval Time=strftime(hour, "%m-%d.%H") | table Time, "Flowers per hour", "Cumulative total"&lt;/P&gt;</description>
      <pubDate>Sun, 29 Jan 2012 19:26:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Cumulative-sum-of-a-distinct-count-over-time/m-p/60349#M180131</guid>
      <dc:creator>howyagoin</dc:creator>
      <dc:date>2012-01-29T19:26:43Z</dc:date>
    </item>
  </channel>
</rss>

