Splunk Search

How do I create a rolling time window which counts forward and backward?

fuwuqi
Engager

Given a dummy index/data consisting of the following fields:

sku_number      customers_id      date_purchase
 --------------  ---------------  ---------------
 1               99               2017-12-01 2:23:33
 2               99               2017-12-01 2:23:33
 3               104              2018-12-20 2:24:41
 1               99               2018-01-20 2:24:41
 1               111              2018-01-30 2:24:43
 3               200              2018-02-25 2:05:01
 3               200              2018-02-27 2:25:01
 2               99               2018-03-20 2:24:41
 3               200              2018-04-02 2:25:01
 2               88               2018-05-12 2:12:01
 3               104              2018-06-24 2:25:01

I have been tasked to generate using Splunk monthly counts of (1) new purchases (2) lost purchases within a time window specified by the manager e.g. half-yearly. A new purchase is defined as a product/sku that has not been purchased by a customer in the previous 3 months. A lost purchase is defined as a product/sku that has been purchased previously but 3 months have lapsed since the last purchase. For both counts, each customer for a given product/sku contributes a unique count. If a customer purchases the same sku multiple times within the same month, they count as 1 in the count so the count is closer to a dc(sku-customer) as shown in the code below.

I have searched in vain for a solution that allows me to calculate a rolling time window of counts. With my current Splunk know-how, the best I could do is to perform a count for purchases within a 6 months window based on the data above as follows:

index=customer_purchases date_purchase >= "2018-01-01" AND date_purchase < "2018-06-31"
 |eval purchasedate = strftime(strptime(date_purchase,"%Y-%m-%d"),"%Y-%b")
 |eval skucust = customers_id . "-" . sku_number
 |stats dc(skucust) by purchasedate

Would appreciate advice on how I could look backward 3 months in time to generate the count, firstly for new purchases based on the data above which should produce a table showing results as follows:

month          count     
 --------------  ---------------
 2018-Jan              1
 2018-Feb              1
 2018-Mar              1
 2018-Apr              0
 2018-May              1
 2018-Jun              1

The logic behind the results table is such that for Jan'18, which has two purchases in the data given above, one of them is a repeat purchase from a customer in Dec'17 (assuming no purchases in Nov'17, Oct'17), hence for Jan'18, only 1 purchase is considered in the count. Similarly the count is 0 for Apr'18, even though the data shows customer #200 purchased sku #3, because this is a repeat purchase that occurred within 3 months from the last purchase. Another result to highlight is Mar'18 which shows customer #99 making a repeat purchase of sku #2 on 2018-03-20 2:24:41, this was counted as a new purchase since more than 3 months have lapsed between this purchase and the last purchase of sku #2 by this customer on 2017-12-01 2:23:33.

Secondly, for lost purchases, the approach that I have in mind is to shift the time window requested by the manager 3 months back in time and perform a rolling window count similar to the above. So for example in the sample time window given below from Mar-Jun, the SPL syntax has to examine each months' distinct sku-customer starting from Dec'17 to Apr'18 and perform a count 3-month forward. Hence, based on the data above, although there were 3 purchases in Dec'17, one of the purchases was followed by a repeat purchase on 2018-01-20 2:24:41, so for Mar'18 only 2 were counted as lost purchases. The subsequent two purchases in Jan'18 were not followed by repeat purchases in the following 3 months hence they were counted as lost purchases in Apr.

month          count     
 --------------  ---------------
 2018-Mar              2
 2018-Apr              2
 2018-May              0
 2018-Jun              1

Appreciate advice on this. Thanks.

Tags (3)
0 Karma

mstjohn_splunk
Splunk Employee
Splunk Employee

hi @fuwuqi,

Sorry you haven't received any answers to your question. Were you able to come up with a solution on your own? If so, please share it with us, so others can learn from it.

But if not, you should join the 5000+ Splunk users in our public Slack Community chat. People ask each other for immediate help on there daily. You can share your question/link to your post there to see if anyone can take a stab at it.

You first have to request access through https://splk.it/slack Fill out the form, and once you receive the approval email from our Community Manager (usually the approval process may take a couple days), you can access Slack.com and ask for help in the #general channel.

0 Karma
Get Updates on the Splunk Community!

Automatic Discovery Part 1: What is Automatic Discovery in Splunk Observability Cloud ...

If you’ve ever deployed a new database cluster, spun up a caching layer, or added a load balancer, you know it ...

Real-Time Fraud Detection: How Splunk Dashboards Protect Financial Institutions

Financial fraud isn't slowing down. If anything, it's getting more sophisticated. Account takeovers, credit ...

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

 Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...