Splunk Search

Why are monthly numbers changing when enlarging the timerange?

HeinzWaescher
Motivator

Hi,

my goal is to calculate the number of retained customers per month.

So let's say our timerange starts in 2015-11 and 100 customer logged in during this month. Then I want to know how many of these 100 customers logged in in 2015-12 as well. And so on...
In general: How many users logged in month n and in month n-1?

The final table would look like this

month | active customers | retained customers from previous month
2015-11 | 100 | -
2015-12 | 500 | 25
2016-01 | 1000 | 90

I'm using a search with timerange from 2015-11 to 2016-01 that ends up in correct results.

  sourcetype=logins
| bucket span=1mon _time
| stats count BY account_id, _time
| eventstats values(_time) AS login_months BY customer_id

| eval month=round(relative_time(_time, "@mon"), 0)
| eval month_1=round(relative_time(_time, "-1mon@mon"), 0)

| timechart span=1mon dc(eval(if(_time=month, customer_id, null()))) AS customers,
dc(eval(if(_time=month AND login_months=month_1, customer_id, null()))) AS returning_customers

But using an earlier start date (2015_07 - 2016_01) is also changing the number of retaining customers in 2015-12. This does not make sense and I can't figure out which part of the search is causing this issue.

Thanks in advance

0 Karma
1 Solution

somesoni2
Revered Legend

Give this a try

sourcetype=logins | bucket span=1mon _time | stats count by account_id, _time 
| streamstats window=2 values(_time) as times by account_id
| eval returning=if(mvindex(times,0)=relative_time(mvindex(times,1),"-1mon"),1,0)
| timechart span=1mon dc(account_id) as customers sum(returning) as returning_customer

View solution in original post

somesoni2
Revered Legend

Give this a try

sourcetype=logins | bucket span=1mon _time | stats count by account_id, _time 
| streamstats window=2 values(_time) as times by account_id
| eval returning=if(mvindex(times,0)=relative_time(mvindex(times,1),"-1mon"),1,0)
| timechart span=1mon dc(account_id) as customers sum(returning) as returning_customer

View solution in original post

HeinzWaescher
Motivator

For some reason I have to round the relative_time part first to make it working:

 sourcetype=logins | bucket span=1mon _time

 | stats count by account_id, _time 
 | streamstats window=2 values(_time) as times by account_id

| eval a=mvindex(times,0)
| eval b=relative_time(mvindex(times,1),"-1mon")
| eval returning_user=if(a=b, 1,0)
| timechart span=1mon dc(account_id) as customers sum(returning) as returning_customer

Thanks a lot!

0 Karma