Hi, I have a CSV with something like the one shown. first field is order id and second field is product code.
ordr prdcd a 1 a 2 b 3 c 4 c 1 c 5 d 6 d 7 e 2 f 1 g 1 g 5
Ask- Look at order id 'a', it has 3 related products 1, 2 & 3. Ask is to find out if an order has a product , then what else could be the products that are usually purchased along with the same product?
For example, order ids c & g also have product code 1 , in addition both c and g order codes have linked product code 5, so for order id a, and product code 1 (first row) , we should be able to suggest to include product code 5 to be bundled / offered. However, when we come to the second row, there should be 0/no recommendations since the product code 2 is only being used in order id e and e in itself does not have any other linked product codes.
Another example are rows with order id f and order id c(row order id c product code 1). Here the recommendation should be 50% of order ids having product code 1 also have product code 5 AND 25% of order ids having product code 1 also have product code 2 & 3. Since order id a has linked products 2 & 3
I need this kind of recommendations in a third column for each row
Try appending this to the search that generates the data you have above:
| stats values(prdcd) as itemsordered by ordr | stats values(itemsordered) as recommendations by itemsordered | rename itemsordered as prdcd | nomv recommendations
Without seeing your original source it would be difficult to say for sure, however you would most likely be best to write this to a lookup file (using outputlookup) and keep the file updated via a scheduled search, that way you will be able to add an auto lookup to your original search.
Hi @sashraf thanks for the reply.
I had reached until this point earlier, I guess the stats is the easier part. However, the difficulty is I was not able to add a split by clause to get a percentage of the top linked products.
For example , in my above example look at product 1, it is present in order types a,c,f,g. Now, look at the OTHER products (except product 1) in these 4 orders. 'a' has product type 2 in addition to product type 1, 'c' has product types 4& 5 linked in addition to type 1, 'f' has no other products except type 1, lastly 'g' has type 5 linked other than one.
Hence, the expected recommendation in this case should be something like - 50% of order ids having product code 1 also have product code 5 AND 25% of order ids having product code 1 also have product code 2 & 3. Since order id a has linked products 2 & 4. I am not too worried about the text, but I do need an associated occurrence based count/percentage for the linked products, something like the preceding statement or at least something like
prd ordr linkedprd linkedprdcount
1 a 5 2
c 2 1
f 3 1