I feel I need a sub-search, but...

bowesmana · ‎12-09-2013

I have 2 sourcetypes

logins - contain login events for learners
learners - contains the learners

The learners are in different companies. I want to show the number of unique users who logged in by company as a percentage of the total number of learners

I can get the learner count per company with

sourcetype=learners* | eventstats dc(Username) as maxUsers by Co_Name

but I want to then count the number of unique logins for that company, which I can do with

sourcetype=logins* | stats dc(Username) as Users by Co_Name

but I want to then show

Co_Name, Users, maxUsers, %ofMaxUsers

and I'm not sure how to combine the two values and report the %

kristian_kolb · ‎12-09-2013

The 'problem' is the fact that the username field is the same in both sourceypes. One approach is to change this;

| multisearch [search sourcetype=logins | rename Username as login] [search sourcetype=learners] | stats dc(Username) as maxUsers dc(login) as logins by Co_name | eval "%maxUsers" = logins/maxUsers*100

This works fine - even for scenarios where there are potential learners who never logged on.

/K

kristian_kolb · ‎12-09-2013

Yeah, well, while the join will work for all Co_names that have at least on login, it's about the most 'expensive' operation you can do. I also tried a few times with stats/eventstats combos before realizing that you can not get it to work, due to the inability to separate the usernames from each sourcetype.

NOTE: normally it's a good thing to have the same field name for the same type of data across sourcetypes, but for this particular search it created a small problem.

Please mark the question as answered if it solved your problem, thanks.

/K

bowesmana · ‎12-09-2013

This works and is instant compared with the quite long JOIN version above, which takes many seconds.

somesoni2 · ‎12-09-2013

Option 1:

sourcetype=logins* | stats dc(Username) as Users by Co_Name | appendcols [search sourcetype=learners* | eventstats dc(Username) as maxUsers by Co_Name] | eval "%ofMaxUser"=(Users*100)/maxUsers

Option 2:

sourcetype=logins* | stats dc(Username) as Users by Co_Name | join Co_Name [search sourcetype=learners* | eventstats dc(Username) as maxUsers by Co_Name] | eval "%ofMaxUser"=(Users*100)/maxUsers

bowesmana · ‎12-09-2013

Hi somesoni2. The appendcols version did not work, not sure what it was giving me, but the maxUsers value was wrong. The JOIN variant worked, but is very slow. It takes 6 seconds, compared to Kristian's multisearch below.

kristian_kolb · ‎12-09-2013

neither of these handle scenarios where no users from a company ever logged in. and you'd probably want to add

| fields + Co_name, Users, maxUsers, "%ofMaxUsers"

at the end

alacercogitatus · ‎12-09-2013

Edited - Try this:

sourcetype=learners* OR sourcetype=logins* | eventstats dc(Username) as maxUsers | stats dc(Username) as Users max(maxUsers) as maxUsers by Co_Name | eval PercentOfMax = (Users / maxUsers) * 100

bowesmana · ‎12-09-2013

Thanks guys. With "eventstats dc(Username) as maxUsers" it does not split maxUsers by Co, so I put "by Co_Name", which gives the right maxUsers, but then Users=maxUsesr all the time. dc(Username) in learners is different to dc(Username) in logins.

kristian_kolb · ‎12-09-2013

I'm still not sure that this will make sense either, you lose the 'by company'-max value. It will be the grand total across all companies.

alacercogitatus · ‎12-09-2013

yeah, oops. I removed the by clause in the eventstats - should be correct now.

kristian_kolb · ‎12-09-2013

will this not lead to Users = maxUsers at all times...?

I feel I need a sub-search, but...

Video | Welcome Back to Smartness, Pedro

Detector Best Practices: Static Thresholds

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...