I have been tasked to generate using Splunk monthly counts of (1) new purchases (2) lost purchases within a time window specified by the manager e.g. half-yearly. A new purchase is defined as a product/sku that has not been purchased by a customer in the previous 3 months. A lost purchase is defined as a product/sku that has been purchased previously but 3 months have lapsed since the last purchase. For both counts, each customer for a given product/sku contributes a unique count. If a customer purchases the same sku multiple times within the same month, they count as 1 in the count so the count is closer to a dc(sku-customer) as shown in the code below.
I have searched in vain for a solution that allows me to calculate a rolling time window of counts. With my current Splunk know-how, the best I could do is to perform a count for purchases within a 6 months window based on the data above as follows:
The logic behind the results table is such that for Jan'18, which has two purchases in the data given above, one of them is a repeat purchase from a customer in Dec'17 (assuming no purchases in Nov'17, Oct'17), hence for Jan'18, only 1 purchase is considered in the count. Similarly the count is 0 for Apr'18, even though the data shows customer #200 purchased sku #3, because this is a repeat purchase that occurred within 3 months from the last purchase. Another result to highlight is Mar'18 which shows customer #99 making a repeat purchase of sku #2 on 2018-03-20 2:24:41, this was counted as a new purchase since more than 3 months have lapsed between this purchase and the last purchase of sku #2 by this customer on 2017-12-01 2:23:33.
Secondly, for lost purchases, the approach that I have in mind is to shift the time window requested by the manager 3 months back in time and perform a rolling window count similar to the above. So for example in the sample time window given below from Mar-Jun, the SPL syntax has to examine each months' distinct sku-customer starting from Dec'17 to Apr'18 and perform a count 3-month forward. Hence, based on the data above, although there were 3 purchases in Dec'17, one of the purchases was followed by a repeat purchase on 2018-01-20 2:24:41, so for Mar'18 only 2 were counted as lost purchases. The subsequent two purchases in Jan'18 were not followed by repeat purchases in the following 3 months hence they were counted as lost purchases in Apr.
Sorry you haven't received any answers to your question. Were you able to come up with a solution on your own? If so, please share it with us, so others can learn from it.
But if not, you should join the 5000+ Splunk users in our public Slack Community chat. People ask each other for immediate help on there daily. You can share your question/link to your post there to see if anyone can take a stab at it.
You first have to request access through https://splk.it/slack Fill out the form, and once you receive the approval email from our Community Manager (usually the approval process may take a couple days), you can access Slack.com and ask for help in the #general channel.