Splunk Search

How do I compare a count of events in a given date_month by foo, over the last X years?

kknopp
Path Finder

I currently have a search that uses 3 join statements to get me the criteria I'm looking for over the last three years. Essentially it looks like:

earliest=@year latest=now EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as newUsers by foo | join foo [search earliest=-1year@year latest=@year EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as olderUsers by foo ] | join foo [search earliest=-2year@year latest=-1year@year EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as oldestUsers by foo ] | table foo oldestUsers olderUsers newUsers

This gives me a lovely visualization with 3 bars per foo, each the count of users that logged in to our software in January of this year, last year, and 2014. The problem is twofold: (1) it takes forever to run, and more importantly, (2) If user John Smith didn't log in January of this year, there is nothing to join on and compare logins from last year or the year before. Therefore, I'm missing data. When I timechart the above, I don't get the visualization I want, but my data is accurate. I also tried using append and timewrap, neither of which got me what I'm looking for.

This is what the append and a timechart gets me the data by foo, but in 3 groups of bars, 1 group per year.

This is what the timewrap gets me:
timewrap_chart

This is what my join gets me (this is last month compared to this month, and client_type=foo in my above example). This is what I want to see, but I need a method that gets me a complete dataset. Thoughts?
join_chart

0 Karma
1 Solution

somesoni2
Revered Legend

Give this a try.

earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | eval type=case(_time>=relative_time(now(),"@y"),"newUsers",_time>=relative_time(now(),"-1y@y"),"olderUsers",,true(),"oldestUsers") | chart  dc(USERNAME) as Users by foo type 

View solution in original post

somesoni2
Revered Legend

Give this a try.

earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | eval type=case(_time>=relative_time(now(),"@y"),"newUsers",_time>=relative_time(now(),"-1y@y"),"olderUsers",,true(),"oldestUsers") | chart  dc(USERNAME) as Users by foo type 

kknopp
Path Finder

Perfect! This is exactly what I was looking for. Thank you so much!

0 Karma

richgalloway
SplunkTrust
SplunkTrust

My first intuition is try outer joins (join type=outer ...) to get the missing data.

My second intuition is to eliminate the joins. Perhaps this will help:

earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | stats dc(USERNAME) as Users by foo date_year | table foo Users

or

earliest=-2@year latest=now EVENT_TYPE=Login date_month=january | bin span=12mon | stats dc(USERNAME) as Users by foo | table foo Users
---
If this reply helps you, Karma would be appreciated.
0 Karma

kknopp
Path Finder

Thank you for this. I had to make some tweaks, but ultimately couldn't get the chart I was looking for. I needed to see Users per year, grouped by foo. The response below ultimately got what I was looking for, but I appreciate the help!

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...