Given a dummy index/data consisting of the following fields:
sku_number customers_id date_purchase
-------------- --------------- ---------------
1 99 2017-12-01 2:23:33
2 99 2017-12-01 2:23:33
3 104 2018-12-20 2:24:41
1 99 2018-01-20 2:24:41
1 111 2018-01-30 2:24:43
3 200 2018-02-25 2:05:01
3 200 2018-02-27 2:25:01
2 99 2018-03-20 2:24:41
3 200 2018-04-02 2:25:01
2 88 2018-05-12 2:12:01
3 104 2018-06-24 2:25:01
I have been tasked to generate using Splunk monthly counts of (1) new purchases (2) lost purchases within a time window specified by the manager e.g. half-yearly. A new purchase is defined as a product/sku that has not been purchased by a customer in the previous 3 months. A lost purchase is defined as a product/sku that has been purchased previously but 3 months have lapsed since the last purchase. For both counts, each customer for a given product/sku contributes a unique count. If a customer purchases the same sku multiple times within the same month, they count as 1 in the count so the count is closer to a dc(sku-customer) as shown in the code below.
I have searched in vain for a solution that allows me to calculate a rolling time window of counts. With my current Splunk know-how, the best I could do is to perform a count for purchases within a 6 months window based on the data above as follows:
index=customer_purchases date_purchase >= "2018-01-01" AND date_purchase < "2018-06-31"
|eval purchasedate = strftime(strptime(date_purchase,"%Y-%m-%d"),"%Y-%b")
|eval skucust = customers_id . "-" . sku_number
|stats dc(skucust) by purchasedate
Would appreciate advice on how I could look backward 3 months in time to generate the count, firstly for new purchases based on the data above which should produce a table showing results as follows:
month count
-------------- ---------------
2018-Jan 1
2018-Feb 1
2018-Mar 1
2018-Apr 0
2018-May 1
2018-Jun 1
The logic behind the results table is such that for Jan'18, which has two purchases in the data given above, one of them is a repeat purchase from a customer in Dec'17 (assuming no purchases in Nov'17, Oct'17), hence for Jan'18, only 1 purchase is considered in the count. Similarly the count is 0 for Apr'18, even though the data shows customer #200 purchased sku #3, because this is a repeat purchase that occurred within 3 months from the last purchase. Another result to highlight is Mar'18 which shows customer #99 making a repeat purchase of sku #2 on 2018-03-20 2:24:41, this was counted as a new purchase since more than 3 months have lapsed between this purchase and the last purchase of sku #2 by this customer on 2017-12-01 2:23:33.
Secondly, for lost purchases, the approach that I have in mind is to shift the time window requested by the manager 3 months back in time and perform a rolling window count similar to the above. So for example in the sample time window given below from Mar-Jun, the SPL syntax has to examine each months' distinct sku-customer starting from Dec'17 to Apr'18 and perform a count 3-month forward. Hence, based on the data above, although there were 3 purchases in Dec'17, one of the purchases was followed by a repeat purchase on 2018-01-20 2:24:41, so for Mar'18 only 2 were counted as lost purchases. The subsequent two purchases in Jan'18 were not followed by repeat purchases in the following 3 months hence they were counted as lost purchases in Apr.
month count
-------------- ---------------
2018-Mar 2
2018-Apr 2
2018-May 0
2018-Jun 1
Appreciate advice on this. Thanks.
hi @fuwuqi,
Sorry you haven't received any answers to your question. Were you able to come up with a solution on your own? If so, please share it with us, so others can learn from it.
But if not, you should join the 5000+ Splunk users in our public Slack Community chat. People ask each other for immediate help on there daily. You can share your question/link to your post there to see if anyone can take a stab at it.
You first have to request access through https://splk.it/slack Fill out the form, and once you receive the approval email from our Community Manager (usually the approval process may take a couple days), you can access Slack.com and ask for help in the #general channel.