Hello!
I have the following query with the provided fields to track consumption data for customers.
action=load OR action=Download customer!="" publicationId="*" topic="*"
| eval Month=strftime(_time, "%b-%y")
| stats count by customer, Month, product, publicationId, topic
| streamstats count as product_rank by customer, Month
| where product_rank <= 5
| table customer, product, publicationId, topic, count, Month
However, I do not believe it achieves what I am aiming for. The data is structured as follows: products > publication IDs within those products > topics within those publication IDs. What I am trying to do is find the top 5 products per customer per month, then the top 5 publicationIds within each of those products, and then the top 5 topics within each of those publicationIds.
Try something like this. Essentially, you need to calculate the "top 5" at each level and filter out the stats rows that fall outside it before calculating the "top 5" for the next level.
action=load OR action=Download customer!="" publicationId="*" topic="*"
| eval Month=strftime(_time, "%b-%y")
| stats count by customer, Month, product, publicationId, topic
| eventstats sum(count) as product_count by customer Month product
| sort 0 customer Month -product_count
| streamstats dc(product) as product_rank by customer, Month
| where product_rank <= 5
| eventstats sum(count) as publicationId_count by customer Month product publicationId
| sort 0 customer Month product -publicationId_count
| streamstats dc(publicationId) as publicationId_rank by customer Month product
| where publicationId_rank <= 5
| eventstats sum(count) as topic_count by customer Month product publicationId topic
| sort 0 customer Month product publicationId -topic_count
| streamstats dc(topic) as topic_rank by customer Month product publicationId
| where topic_rank <= 5
| table customer, product, publicationId, topic, count, Month
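If you want to sanity-check the output against a small export, the same level-by-level filtering can be sketched outside Splunk. The following is only an illustrative Python sketch, not SPL and not part of the search above: the function names `keep_top_n` and `nested_top_n`, the tuple layout, and the sample rows are all hypothetical. It also breaks ties by name and keeps exactly n values per level, which may differ from how the `sort`/`streamstats dc()` combination treats tied counts.

```python
from collections import defaultdict

# Each row mirrors one result of the `stats count by ...` step (hypothetical layout):
# (customer, month, product, publication_id, topic, count)

def keep_top_n(rows, group_len, n):
    """Keep rows whose value at index group_len is among the n largest
    (by summed count) within the group formed by the first group_len fields."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row[:group_len]][row[group_len]] += row[-1]
    keep = {
        group: set(sorted(values, key=lambda v: (-values[v], v))[:n])
        for group, values in totals.items()
    }
    return [row for row in rows if row[group_len] in keep[row[:group_len]]]

def nested_top_n(rows, n=5):
    """Filter level by level: products, then publication IDs, then topics."""
    rows = list(rows)
    # group_len 2 ranks products per (customer, month);
    # group_len 3 ranks publication IDs per product;
    # group_len 4 ranks topics per publication ID.
    for group_len in (2, 3, 4):
        rows = keep_top_n(rows, group_len, n)
    return rows

# Made-up sample data, using n=2 instead of 5 to keep it short.
sample = [
    ("acme", "Jan-24", "p1", "pubA", "t1", 5),
    ("acme", "Jan-24", "p1", "pubA", "t2", 3),
    ("acme", "Jan-24", "p1", "pubA", "t3", 1),
    ("acme", "Jan-24", "p2", "pubB", "t1", 4),
    ("acme", "Jan-24", "p3", "pubC", "t1", 2),
]
result = nested_top_n(sample, n=2)
```

With this sample, product p3 is dropped at the first level and topic t3 at the third, because each level is ranked only on the rows that survived the previous one. That is the same idea as placing each `where ..._rank <= 5` before the next `eventstats`.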