In my Splunk data (say) I've got a running list of customer purchases, with a customer ID number and an Item Number as part of the data. I'd like Splunk to show me sort of a "social networking" view of who is buying what, based upon a single initial purchase. (No, I don't work for Amazon...)
In other words, like to work out all of the combinations and permutations of customers IDs and item numbers and display them all - "Show me all of the purchases made by customers who bought an iPhone, and for those non-iPhone purchases, show me who else bought those items (and so on)"...
So, something like:
1) Start with a single unique Item Number
2) Get a list of associated Customer IDs
3) Get a list of associated Item Numbers that each of the customers in (2) purchased
4) Repeat 2 and 3 until the list stops growing.
5) Display results
I know I can use a sub search to get a complete list of things up through step (3), but I'm looking to go the next step, and for each of the unique Item Numbers returned as part of step 3, to feed it back through Splunk to see which customer IDs purchased those items, and so on.
Eventually, the list will stop "growing" as there are only so many customers and item numbers in my catalogue.
The ideal output would then tell me which item numbers were purchased by which customer IDs as matched in the full data, but all based upon a single initial purchase of a specific item. That's a key part to all this -- as I'm trying to get a picture of who bought what, when, based upon buying behaviour itself.
To achieve your step 4, you need to compute a transitive closure of the purchase relationship among products. For example, if X is the set of products and x R y means "product x purchasers also purchased product y", then the transitive closure of R on X is the relation R+: "we hope product x purchasers have some interest in product y." See Wikipedia for more information.
You could create a custom search command to compute the transitive closure.
I would guess that the transitive closure would grow to include most products and customers if:
You sell the same product to many different customers.
Many customers buy a variety of products.
You can explore several iterations using Splunk lookup tables. First, create a lookup table that records all of the purchases (mapping your customer and product ID numbers):