I've got a variable, call it "flowers," related to orders from a shop. I'm trying to get a chart of the number of unique flower types purchased over a time range, as well as the cumulative total.
I've found another answer here which gave the hint of:
searchterm | stats values(flowers) as fcount dc(flowers) by date_minute | streamstats dc(fcount) as "Cumulative total"
And whilst that works, it breaks down the number of unique flowers per a given minute, across the date range -- all fine if that range is within a given hour, but, otherwise it's not quite the right way to break this down.
What I'm after though is a way to look at 36 hours worth of flower sales, and for each hour, or minute, see the number of unique flower types sold, but also have the cumulative total.
As I interpret the above example, it's recycling "minutes" (or hours if I use date_hour) so that the 12th minute across two days is only represented by a single value..
Missing something obvious here.....
... | sort _time | streamstats global=f window=0 current=t values(flowers) as v_flowers | eval v_flowers=mvjoin(v_flowers,";") | timechart span=1h dc(flowers) as cur_dc, last(v_flowers) as cum_dc | eval cum_dc=mvcount(split(cum_dc,";"))
The main tricks are (a) you need to sort and get the cumulative count first, and (b) convert the list of items from a multivalue field since it seems that the
last() function doesn't preserve multivalues.
I think I follow the logic here, will have to experiment. I ran into a 10000 row limit on this one though. I did find a "hack" that seems to do it:
...| bin _time span=6h as hour | stats values(flowers) as cflowers dc(flowers) as "Flowers per hour" by hour | streamstats dc(cflowers) as "Cumulative total" | eval Time=strftime(hour, "%m-%d.%H") | table Time, "Flowers per hour", "Cumulative total"
I think you may want this :
... earliest=-36h | timechart span=1h dc(flower) AS "# of flower types purchased" | streamstats sum("# of flower types purchased") AS total
I'm not sure how relevant the cumulative total will be though, as it is only representative of the sum of distinct counts over time, which obviously will have duplicate flower types.
You might find the results of this search more interesting as they will expose similar information :
... earliest=-36h | timechart span=1h count by flower | addtotals | appendcols [search ... | timechart span=1h dc(flower) AS "# of flower types purchased"]
This will provide the number of purchases per flower type, the total number of purchases, and the distinct count of flower types purchased for every hour.
I see. How about :
... earliest=-36h | timechart span=1h count, dc(flower) AS "Distinct count of flower types purchased" | streamstats sum(count) AS "Total flowers purchased" | fields - count
Both of these are interesting, thanks, gives me something to play with. What I was hoping to accomplish though was to have a graph of time on the X axis, number of flowers on the Y, with one line representing the number of unique flowers per that increment of time (hour/minute, whatever) -- but a second line representing the cumulative total over all time, rather than just for that unit of time.
So the second line would never, in theory, go down; if there were 2 unique flowers at noon on Monday, and 4 at 1300, and a total of 5 unique between those times, the second line should be at "5" on Y