I was asked to provide active users since December 2016. With the logs we're working with, there's really no way to get a number of "active users"; someone at my branch suggested the best way would be to get a number for distinct users, but there would be no way to distinguish between users who logged into the system once and those who log in regularly.
My proposed solution is to provide the distinct user count but to only count users that appear in 5 or more events (I was told I can assume 5 events is enough to consider them "active"). Is there a way to do this or would anyone have some advice? What I have so far is just the distinct users part, but I haven't been able to figure out what limits the distinct user count based on number of events per distinct user.
sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats dc(UID) as distinct_users
using @somesoni2 search for dc as well if needed
sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats count as active_users by UID | where active_users >=5
| stats dc(UID) as distinct_active_users
using @somesoni2 search for dc as well if needed
sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats count as active_users by UID | where active_users >=5
| stats dc(UID) as distinct_active_users
I tested this comparing the results to my basic distinct users search and then just running your query up to the end of the first stats command. It appears to work! Thank you very much.
maybe this can work:
sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats count as active_users by UID | where count >=5 | sort -count
I'm unable to parse the uri_path= UID=
part.
hmmm, i used the same base search you shared in question, the only change i made is after stats:
... | stats count as active_users by UID | where count >=5 | sort -count
Your search in question gives you a single row-column with distinct count of users. If your desirable output is the same, just add your stats command to end of @adonio's search.
sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats count as active_users by UID | where count >=5 | stats dc(UID) as distinct_active_users
@adonio Sorry about that, I must have pasted the search incorrectly.So far I'm not getting any results with your actual search though.
somesoni2, your search gives me the "distinct_active_users" bar but the results just stay at 0. I tried some other variations of my search but it remains the same.
does your base search produces results?
is UID a field that already exists and has valid values?
Yes to both.
Maybe I'm off the mark but I think the error might be in the initial stats search. Correct me if I'm wrong, but doesn't stats count as active_users by UID
need something in front of count? Or is there a default field it uses if you don't specify anything in brackets such as count(UID) as distinct_user by _time
?
What results you get when you run your search?? It's strange, the only thing @adonio's search is using is field UID, which seems to be in same letter-case as yours.
sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats dc(UID) as distinct_users
Actually, it's the where count >= 5
that is not working for me. sourcetype=access* uri_path=* NOT eventtype=Web* |fillnull value="Missing Field"| stats count as active_users by UID
works but it gives me everything.
my bad, must be the time and day
thee where is to be by active_users since i stated the count to be named active users
so: ... | stats count as active_users by UID | where active_users >=5
or: ... | stats count by UID | where count >=5
ill put it in an answer.