Hello all,
I'm trying to get the stats commands to work in chain. I have the following data:
08 January 2016 09:10:10 website=abc.com, user=user1, message=blahblah1
08 January 2016 09:10:11 website=abc.com, user=user1, message=blahblah2
08 January 2016 09:10:12 website=abc.com, user=user1, message=blahblah2x
08 January 2016 09:10:13 website=abc.com, user=user1, message=blahblah2xxx
08 January 2016 09:10:14 website=abc.com, user=user1, message=blahblah2xxx
08 January 2016 09:10:15 website=abc.com, user=user1, message=blahblah2xxxxx
08 January 2016 09:10:16 website=abc.com, user=user2, message=blahblah3x
08 January 2016 09:10:17 website=abc.com, user=user2, message=blahblah3xx
08 January 2016 09:10:18 website=abc.com, user=user2, message=blahblah3
08 January 2016 09:10:19 website=abc.com, user=user3, message=blahblah4
08 January 2016 09:10:20 website=def.com, user=user1, message=blahblah5
08 January 2016 09:10:21 website=def.com, user=user2, message=blahblah6
08 January 2016 09:10:22 website=def.com, user=user2, message=blahblah7
08 January 2016 09:10:23 website=def.com, user=user2, message=blahblah8
08 January 2016 09:10:24 website=xyz.com, user=user3, message=blahblah9
And I want to get the Top 2 websites listing for only Top 2 users per website; hence the following table output is what I'm trying to get:
Website User
abc.com user1
abc.com user2
def.com user2
def.com user1
I was looking into foreach command, but that does not allow me to use top/stats. E.g:
… | top limit=2 website | foreach website [ search website=<<FIELD>> | top limit=2 user ]
I’d appreciate any feedback.
I would use a subsearch:
# Top websites for top 2 users
yoursearch [ yoursearch | top limit=2 user | table user]
| top limit=0 website, user
# Top users for top 2 websites
yoursearch [ yoursearch | top limit=2 website | table website]
| top limit=0 website, user
| makeresults
| eval data="
08 January 2016 09:10:10 website=abc.com, user=user1, message=blahblah1;
08 January 2016 09:10:11 website=abc.com, user=user1, message=blahblah2;
08 January 2016 09:10:12 website=abc.com, user=user1, message=blahblah2x;
08 January 2016 09:10:13 website=abc.com, user=user1, message=blahblah2xxx;
08 January 2016 09:10:14 website=abc.com, user=user1, message=blahblah2xxx;
08 January 2016 09:10:15 website=abc.com, user=user1, message=blahblah2xxxxx;
08 January 2016 09:10:16 website=abc.com, user=user2, message=blahblah3x;
08 January 2016 09:10:17 website=abc.com, user=user2, message=blahblah3xx;
08 January 2016 09:10:18 website=abc.com, user=user2, message=blahblah3;
08 January 2016 09:10:19 website=abc.com, user=user3, message=blahblah4;
08 January 2016 09:10:20 website=def.com, user=user1, message=blahblah5;
08 January 2016 09:10:21 website=def.com, user=user2, message=blahblah6;
08 January 2016 09:10:22 website=def.com, user=user2, message=blahblah7;
08 January 2016 09:10:23 website=def.com, user=user2, message=blahblah8;
08 January 2016 09:10:24 website=xyz.com, user=user3, message=blahblah9"
| makemv data delim=";"
| mvexpand data
| rex field=data "(?<Timestamp>\d+\s\w+\s\d+\s\d+:\d+:\d+)\s[^\s]+=(?<website>[^\s]+),\s[^\s]+=(?<user>[^\s]+),\s[^\s]+=(?<message>[^\s]+)"
| table website user
| top 2 user by website showcount=f showperc=f
| rename user as User, website as Website
If you have always four records finally ie 2 websites * 2 users , try this .
<search> |table user website |top limit=2 user by website|head 4
It can't be that simple i believe. I should be missing something 🙂
I would use a subsearch:
# Top websites for top 2 users
yoursearch [ yoursearch | top limit=2 user | table user]
| top limit=0 website, user
# Top users for top 2 websites
yoursearch [ yoursearch | top limit=2 website | table website]
| top limit=0 website, user
Thank you javiergn. This helped me find out the solution as the following:
mysearchstring [ mysearchstring | top limit=2 website | table website ] | stats count by website,user | sort +website,-count | dedup 2 website
first limit
is for top websites and limiting the dedup
is for top users per website.
I wanted to add another quick comment on the solution. The one I put up there did not properly sort the results. I wanted the results to be grouped by websites (top websites by count or another field value, eg. bandwidth). I tried something like the following (2 subsearches):
mysearchstring [ mysearchstring | top limit=2 website | table website ] [search [ mysearchstring | top limit=2 website | table website ] | stats count by user | sort 2 -count| table user] | stats count by website,user
But this also does not group properly and seems costly. In order to sort, I had to add a sort_field
to each event and then use that. For this, I used eventstats
to compute a new field and use the average of that when evaluating sort_field
.
mysearchstring [ mysearchstring | top limit=2 website | table website ] | eventstats count AS sortcount by website | stats count, avg(sortcount) AS sort_field by website,user | dedup 2 website | sort -sort_field | fields website, user