Suppose my log entries resembled:
Rick ate a cheeseburger
Tony ate a grape
Rick ate a frenchfry
Tony ate a grape
Rick ate a cheeseburger
Sally ate a salad
...
So I have two fields of interest: "name" and "food".
Now, I'd like to know which user eats the most different kinds of food.
I believe the associate command can be used to tell me which users are most likely to eat which foods. What I'd rather find is almost the opposite. Given a key, how many different (unpredictable) values is it paired with?
Then I could do things like send an alert "RicksDog ate 7 different kinds of food in the past 24 hours - he's going to be sick!".
I can easily do this with a script. I'd like to do it in Splunk, if possible.
Assuming that you've extracted "name" and "food", you can search:
... | stats dc(food) as food_count values(food) as foods by name | sort - food_count
This will give you a list of users sorted by the number of distinct foods each one ate, highest first, along with the foods themselves.
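For the alert described in the question, one possible sketch is to limit the search to the past 24 hours and keep only users at or above a fixed count. The threshold of 7 is taken from the question's example. The "earliest=-24h" modifier (which belongs in the base search, or can be replaced by the time range picker) and the "where" clause are assumptions about how you might set this up, not part of the search above:
... earliest=-24h | stats dc(food) as food_count values(food) as foods by name | where food_count >= 7
If you save this as an alert that triggers when the number of results is greater than zero, each result row should correspond to one user who crossed the threshold.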
You can also use "eventstats" to add the average food count across all users to every row, so that you can compare each user against that average if you're unsatisfied with absolute thresholds (a fixed count or top n).
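As a rough sketch of that idea, the search below appends the overall average to each per-user row and keeps only the users above it. The field name "avg_food_count" is made up for illustration, and "above average" is just one possible relative threshold:
... | stats dc(food) as food_count values(food) as foods by name | eventstats avg(food_count) as avg_food_count | where food_count > avg_food_count | sort - food_count
Because "eventstats" runs after "stats" here, the average is taken over the per-user counts (one row per name), not over the raw events.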