I would like to group the daily users by their number of active days during the last 2 weeks. My current search looks like this:
... | bucket span=1d _time | stats dc(user_id) AS count BY user_id, _time | sort 0 + _time | streamstats window=14 dc(_time) AS active_days BY user_id | timechart span=1d dc(user_id) BY active_days
But this calculation does not include days of inactivity (because of the stats command) and ends up incorrect results.
How can I achieve my goal?
... | timechart span=1d count BY user_id | streamstats window=14 count(eval(count>0)) AS active_days BY user_id | timechart span=1d dc(user_id) BY active_days
thanks for your answer. But I think the streamstats doesn't like the "timechart BY" before. It is not adding the active_days field
... | timechart span=1d count BY user_id | untable _time user_id count | streamstats window=14 count(eval(count>0)) AS active_days BY user_id | timechart span=1d dc(user_id) BY active_days
thanks a lot for coming back to my problem.
I think we have to do some small adjustments, then this approach shows complete & correct results:
| timechart span=1d count BY userid limit=10000000
| untable _time userid count
| sort 0 + account_id, _time
| streamstats window=14 global=false count(eval(count>0)) AS activedays BY userid
| timechart span=1d dc(eval(if(count>0, accountid, null()))) BY activedays
The sort command should have the same effect on the results like global=false, but I hope it helps to improve the peformance of streamstats.
Is there an option that tells a timechart to set the limit=infinite ?
sort portion is destroying the function of the search. The results must remain sorted primarily by
_time or the
streamstats part will not work correctly so definitely remove the
sort line. That is a very important catch on the
global=false part though!!! Usually
limit=0 disables the limit entirely but I have not tried it on
There is no difference, when I remove the sort line. And I don't get how this could destroy the function.
In the end the streamstats command has to analyse a window of 14 days per user_id. After the sort line, the table is so well prepared, that I could even delete the global=false command.
I think I've found a workaround. But it's veeeery expensive so I'm not really happy with it.
| bucket span=1d _time | stats dc(account_id) AS dc by user_id_1, _time | eventstats values(account_id) AS user_id_2 | mvexpand user_id_2 | stats count(eval(if(user_id_1=user_id_2, user_id_1, null()))) AS active BY user_id_2, _time | sort 0 + user_id_2, _time | streamstats window=14 global=f dc(eval(if(active=1, _time, null()))) AS active_days BY user_id_2 | search active=1 | timechart span=1d dc(user_id_2) BY active_days limit=14