I currently have a search that uses 3 join statements to get me the criteria I'm looking for over the last three years. Essentially it looks like:
earliest=@year latest=now EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as newUsers by foo | join foo [search earliest=-1year@year latest=@year EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as olderUsers by foo ] | join foo [search earliest=-2year@year latest=-1year@year EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as oldestUsers by foo ] | table foo oldestUsers olderUsers newUsers
This gives me a lovely visualization with 3 bars per foo, each the count of users that logged in to our software in January of this year, last year, and 2014. The problem is twofold: (1) it takes forever to run, and more importantly, (2) If user John Smith didn't log in January of this year, there is nothing to join on and compare logins from last year or the year before. Therefore, I'm missing data. When I timechart the above, I don't get the visualization I want, but my data is accurate. I also tried using append
and timewrap
, neither of which got me what I'm looking for.
This is what the append and a timechart gets me the data by foo, but in 3 groups of bars, 1 group per year.
This is what the timewrap gets me:
This is what my join gets me (this is last month compared to this month, and client_type=foo in my above example). This is what I want to see, but I need a method that gets me a complete dataset. Thoughts?
Give this a try.
earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | eval type=case(_time>=relative_time(now(),"@y"),"newUsers",_time>=relative_time(now(),"-1y@y"),"olderUsers",,true(),"oldestUsers") | chart dc(USERNAME) as Users by foo type
Give this a try.
earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | eval type=case(_time>=relative_time(now(),"@y"),"newUsers",_time>=relative_time(now(),"-1y@y"),"olderUsers",,true(),"oldestUsers") | chart dc(USERNAME) as Users by foo type
Perfect! This is exactly what I was looking for. Thank you so much!
My first intuition is try outer joins (join type=outer ...
) to get the missing data.
My second intuition is to eliminate the joins. Perhaps this will help:
earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as Users by foo date_year | table foo Users
or
earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | bin span=12mon | stats dc(USERNAME) as Users by foo | table foo Users
Thank you for this. I had to make some tweaks, but ultimately couldn't get the chart I was looking for. I needed to see Users per year, grouped by foo. The response below ultimately got what I was looking for, but I appreciate the help!