I am starting with this query to show which types of products our top customers buy
``` get all purchases ```
index=customer_data action=purchase
``` narrow down to purchases from our top 100 customers (based on purchase count) ```
[search index=customer_data action=purchase top limit=100 user_id | table user_id]
``` show type of products that each user purchased ```
| stats count, list(product_category) by user_id
This will give some output like this:
user_id | values(product_category)
------------------------------------
1234 | clothing, clothing, clothing, food
2345 | electronics, electronics, food, food
I need one additional piece of information: the ratio of the product categories, for each user. I need this output instead. The format of the distributions is not important
user_id | values(product_category)
------------------------------------
1234 | clothing=75%, food=25%
2345 | electronics=50%, food=50%
The best idea I have thus far is something like this:
``` get all purchases ```
index=customer_data action=purchase
``` narrow down to purchases from our top 100 customers (based on purchase count) ```
[search index=customer_data action=purchase top limit=100 user_id | table user_id]
``` show type of products that each user purchased ```
| stats count, list(product_category) by user_id
| eval product_category_distribution = ???
| table user_id, product_category_distribution
but I can't find any such function ...
You count by category first, then device this count by total.
index=customer_data action=purchase
| top limit=100 user_id
``` the above should give you all purchases by top 100 users ```
| stats count by user_id product_category
| eventstats sum(count) as total by user_id
| eval percent = round(100 * count / total, 2)
| stats values(eval(product_category . "=" . percent . "%")) as product_category by user_id
| eval product_category = mvjoin(product_category)
Here, I fail to see why you need a subsearch to pick out top 100 when the top 100 command should suffice. But if that is needed, just add it to the top.
``` get all purchases ```
index=customer_data action=purchase
``` narrow down to purchases from our top 100 customers (based on purchase count) ```
[search index=customer_data action=purchase top limit=100 user_id | table user_id]
``` show type of products that each user purchased ```
| stats count, list(product_category) as product_category by user_id
| stats values(count) as total count by user_id product_category
| eval product_category=product_category."=".round(100*count/total,2)."%"
| stats values(product_category) as product_category by user_id
| eval product_category=mvjoin(product_category,", ")