Solved: how to search for distinct count of active users b...

pgadhari · ‎03-18-2020

I have to show active vpn users at any point of time for e.g. last 15 minutes, last one hour etc.. but these has to be shown based on the user login and logout status, as when I take more time span then the count is not matching, as it is counting the status=login even though the user has logged out. How to resolve this issue ?

My logs have the field userid, status (login and logout), so I have to calculate accordingly. These are the syslogs from Paloalto vpn.

one of the login event :

Mar 18 16:11:57 172.x.x.x 1,2020/03/18 16:11:57,013101002125,USERID,login,2049,2020/03/18 16:11:44,vsys3,10.252.110.43,dca\user1,,0,1,10800,0,0,vpn-client,globalprotect,6691206232784508797,0x8000000000000000,14,18,0,0,3,,2020/03/18 16:11:45,1,0x80000000,user1

logout event :

Mar 18 16:01:53 172.x.x.x 1,2020/03/18 16:01:53,013101002083,USERID,logout,2049,2020/03/18 16:01:42,vsys3,10.192.114.25,user2,,0,1,0,0,0,vpn-client,globalprotect,6691213783337015900,0x8000000000000000,14,18,0,0,,,3,,2020/03/18 16:01:43,1,0x80000000,user2

I can use below query :

index=paloalto sourcetype="pan:log"  status=login | stats dc(userid) as login_count | appendcols [search index=paloalto sourcetype="pan:log"   status=logout | stats dc(userid) as logout_count] | eval active=login_count-logout_count | table login_count logout_count active

Please help ? using the above query the figure is not matching with actual Paloalto active users.

manjunathmeti · ‎03-18-2020

hi @pgadhari,

Try this:

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats dc(eval(status="login")) as login_count, dc(eval(status="logout")) as logout_count by userid | eval active=login_count-logout_count | table userid, login_count, logout_count, active

Another simple way to do this is use latest function in stats command. Check if latest event contains status=login, if yes then it means it's user is active.

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login"

View solution in original post

woodcock · ‎03-29-2020

Like this:

index="paloalto" AND sourcetype="pan:log" 
| streamstats count(eval(status="login")) AS session_id BY user_id
| stats dc(status) AS status_count values(status) AS status BY user_id session_id
| search NOT status="logout"
| stats count

pgadhari · ‎03-29-2020

@woodcock - I am able to get the count using @manjunathmeti answer. But now I am looking for the total user session duration for each user, so do I need to use earliest(_time) and latest(_time) by userid, status to check how long the user has been logged in the VPN (whether in hours, days ). Please advise on that ?

to4kawa · ‎03-29-2020

index=paloalto sourcetype="pan:log"  status=login OR status=logout
| dedup userid
| table _time userid status
| where status="login"
| eventstats count as active_user

Splunk's default search is the latest one at the top.
dedup keep the top.
User with latest status login is active user.

pgadhari · ‎03-29-2020

@to4kawa - Now, I want to find out the avg session time for each user for last 7 days, how can I display that ?

to4kawa · ‎03-30-2020

my answer is for your question how to search for distinct count of active users based on ...
Now, you have another problem. I see.
Let's ask another question.

pgadhari · ‎03-30-2020

Sure I will open new question

ololdach · ‎03-19-2020

Hi, I'm using this approach in my dashboards:

index=paloalto sourcetype="pan:log" | transaction user_id keepevicted=true startswith=status="login" 
endswith=status="logout"
| eval session=if(status="logout","inactive",if(status="login","active","unknown")) | search NOT session="unknown" 
| eval start_time=_time, end_time=if(closed_txn=1,start_time+duration,now()),duration=(end_time-start_time)
| eval bucket_time_size=900  
| eval start_bucket_time = start_time - (start_time % bucket_time_size) 
| eval end_bucket_time = end_time - (end_time % bucket_time_size) 
| eval buckets = mvrange(start_bucket_time,end_bucket_time,bucket_time_size) 
| eval buckets=if(isnull(buckets),_time,buckets) 
| mvexpand buckets 
| eval _time=if(isnull(buckets),_time,buckets) 
| timechart count by session span=15m

Some hints as to what's happening:
First, the transaction will group the events based on user_id. A closed transaction will have a multi-value field status=[login,logout] whereas an open transaction will have a status="login". A session that has started before the time searched and closes within will have a status="logout". Any session that starts before and extends after will fall into the "unknown" category. If you'd like to see them and have the user_id in those events, replace "unknown" with "active" and adjust the _time and duration to reflect the search range (hint:addinfo).
After the prequel, the binning starts. To make it short, my trick is to define a set of time slices (900s in my example) and for every time slice, determine if the event falls into that time slice or not. Set start_time to the transaction _time which is the time of the first event (login). If the transaction is closed, end_time=start_time+duration, else the transaction is still active, hence end_time=now().
Set a bucket_start_time which is the time the bucket starts that the start event lies in. Same with the end. mvrange creates the list of all buckets (_start_times) that the event covers. mvexpand will create one event per bucket. Set the event _time for every "bucketized" event and you are ready to timechart.

Enjoy

pgadhari · ‎03-28-2020

@ololdach - I dont want to use transaction command, as I dont know how long the user will be logged in so it will be difficult to find out the total duration, and also as the data increases, transaction results in slow query performance, is there any other alternative for this ? Please suggest ?

pgadhari · ‎03-21-2020

Thanks. I will check this query and revert.

manjunathmeti · ‎03-18-2020

hi @pgadhari,

Try this:

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats dc(eval(status="login")) as login_count, dc(eval(status="logout")) as logout_count by userid | eval active=login_count-logout_count | table userid, login_count, logout_count, active

Another simple way to do this is use latest function in stats command. Check if latest event contains status=login, if yes then it means it's user is active.

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login"

pgadhari · ‎03-19-2020

the second query seems to be working fine as a single value viz. But, how can I timechart the latest login counts over the period of 1 hr or 2 hrs ? Please suggest ?

manjunathmeti · ‎03-19-2020

Use earliest, For example

To get count for last 15 mins:

index=paloalto sourcetype="pan:log" earliest=-15m  status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login" | stats count as users

To get count for last 1 hour:

index=paloalto sourcetype="pan:log"  earliest=-1h status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login" | stats count as users

pgadhari · ‎03-28-2020

@manjunathmeti - Now, I want to show the duration of the user also, along with logged out users. The requirement is to show how long the users have logged in the VPN the whole day, and whoever is active, I have to show the duration of the user as one of the field in my dashboard. I know that we have to use earliest(_time) and latest(_time) by user, status (login or logout), but somehow I m facing some issues in getting proper duration. Please help ?

pgadhari · ‎03-21-2020

I am asking for timechart instead of stats. ?

pgadhari · ‎03-18-2020

Sure. I will check this and revert.

how to search for distinct count of active users based on status=login and status=logout ?

Continuing Innovation & New Integrations Unlock Full Stack Observability For Your ...

Monitoring Amazon Elastic Kubernetes Service (EKS)

Cloud Platform & Enterprise: Classic Dashboard Export Feature Deprecation