Splunk Search

how to search for distinct count of active users based on status=login and status=logout ?

pgadhari
Builder

I have to show active vpn users at any point of time for e.g. last 15 minutes, last one hour etc.. but these has to be shown based on the user login and logout status, as when I take more time span then the count is not matching, as it is counting the status=login even though the user has logged out. How to resolve this issue ?

My logs have the field userid, status (login and logout), so I have to calculate accordingly. These are the syslogs from Paloalto vpn.

one of the login event :

Mar 18 16:11:57 172.x.x.x 1,2020/03/18 16:11:57,013101002125,USERID,login,2049,2020/03/18 16:11:44,vsys3,10.252.110.43,dca\user1,,0,1,10800,0,0,vpn-client,globalprotect,6691206232784508797,0x8000000000000000,14,18,0,0,3,,2020/03/18 16:11:45,1,0x80000000,user1

logout event :

Mar 18 16:01:53 172.x.x.x 1,2020/03/18 16:01:53,013101002083,USERID,logout,2049,2020/03/18 16:01:42,vsys3,10.192.114.25,user2,,0,1,0,0,0,vpn-client,globalprotect,6691213783337015900,0x8000000000000000,14,18,0,0,,,3,,2020/03/18 16:01:43,1,0x80000000,user2

I can use below query :

index=paloalto sourcetype="pan:log"  status=login | stats dc(userid) as login_count | appendcols [search index=paloalto sourcetype="pan:log"   status=logout | stats dc(userid) as logout_count] | eval active=login_count-logout_count | table login_count logout_count active

Please help ? using the above query the figure is not matching with actual Paloalto active users.

0 Karma
1 Solution

manjunathmeti
Champion

hi @pgadhari,

Try this:

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats dc(eval(status="login")) as login_count, dc(eval(status="logout")) as logout_count by userid | eval active=login_count-logout_count | table userid, login_count, logout_count, active

Another simple way to do this is use latest function in stats command. Check if latest event contains status=login, if yes then it means it's user is active.

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login"

View solution in original post

0 Karma

woodcock
Esteemed Legend

Like this:

index="paloalto" AND sourcetype="pan:log" 
| streamstats count(eval(status="login")) AS session_id BY user_id
| stats dc(status) AS status_count values(status) AS status BY user_id session_id
| search NOT status="logout"
| stats count

pgadhari
Builder

@woodcock - I am able to get the count using @manjunathmeti answer. But now I am looking for the total user session duration for each user, so do I need to use earliest(_time) and latest(_time) by userid, status to check how long the user has been logged in the VPN (whether in hours, days ). Please advise on that ?

0 Karma

to4kawa
Ultra Champion
index=paloalto sourcetype="pan:log"  status=login OR status=logout
| dedup userid
| table _time userid status
| where status="login"
| eventstats count as active_user

Splunk's default search is the latest one at the top.
dedup keep the top.
User with latest status login is active user.

pgadhari
Builder

@to4kawa - Now, I want to find out the avg session time for each user for last 7 days, how can I display that ?

0 Karma

to4kawa
Ultra Champion

my answer is for your question how to search for distinct count of active users based on ...
Now, you have another problem. I see.
Let's ask another question.

0 Karma

pgadhari
Builder

Sure I will open new question

0 Karma

ololdach
Builder

Hi, I'm using this approach in my dashboards:

index=paloalto sourcetype="pan:log" | transaction user_id keepevicted=true startswith=status="login" 
endswith=status="logout"
| eval session=if(status="logout","inactive",if(status="login","active","unknown")) | search NOT session="unknown" 
| eval start_time=_time, end_time=if(closed_txn=1,start_time+duration,now()),duration=(end_time-start_time)
| eval bucket_time_size=900  
| eval start_bucket_time = start_time - (start_time % bucket_time_size) 
| eval end_bucket_time = end_time - (end_time % bucket_time_size) 
| eval buckets = mvrange(start_bucket_time,end_bucket_time,bucket_time_size) 
| eval buckets=if(isnull(buckets),_time,buckets) 
| mvexpand buckets 
| eval _time=if(isnull(buckets),_time,buckets) 
| timechart count by session span=15m

Some hints as to what's happening:
First, the transaction will group the events based on user_id. A closed transaction will have a multi-value field status=[login,logout] whereas an open transaction will have a status="login". A session that has started before the time searched and closes within will have a status="logout". Any session that starts before and extends after will fall into the "unknown" category. If you'd like to see them and have the user_id in those events, replace "unknown" with "active" and adjust the _time and duration to reflect the search range (hint:addinfo).
After the prequel, the binning starts. To make it short, my trick is to define a set of time slices (900s in my example) and for every time slice, determine if the event falls into that time slice or not. Set start_time to the transaction _time which is the time of the first event (login). If the transaction is closed, end_time=start_time+duration, else the transaction is still active, hence end_time=now().
Set a bucket_start_time which is the time the bucket starts that the start event lies in. Same with the end. mvrange creates the list of all buckets (_start_times) that the event covers. mvexpand will create one event per bucket. Set the event _time for every "bucketized" event and you are ready to timechart.

Enjoy

pgadhari
Builder

@ololdach - I dont want to use transaction command, as I dont know how long the user will be logged in so it will be difficult to find out the total duration, and also as the data increases, transaction results in slow query performance, is there any other alternative for this ? Please suggest ?

0 Karma

pgadhari
Builder

Thanks. I will check this query and revert.

0 Karma

manjunathmeti
Champion

hi @pgadhari,

Try this:

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats dc(eval(status="login")) as login_count, dc(eval(status="logout")) as logout_count by userid | eval active=login_count-logout_count | table userid, login_count, logout_count, active

Another simple way to do this is use latest function in stats command. Check if latest event contains status=login, if yes then it means it's user is active.

index=paloalto sourcetype="pan:log"  status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login"
0 Karma

pgadhari
Builder

the second query seems to be working fine as a single value viz. But, how can I timechart the latest login counts over the period of 1 hr or 2 hrs ? Please suggest ?

0 Karma

manjunathmeti
Champion

Use earliest, For example

To get count for last 15 mins:

index=paloalto sourcetype="pan:log" earliest=-15m  status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login" | stats count as users

To get count for last 1 hour:

index=paloalto sourcetype="pan:log"  earliest=-1h status=login OR status=logout | stats latest(status) as login_status by userid | where login_status="login" | stats count as users
0 Karma

pgadhari
Builder

@manjunathmeti - Now, I want to show the duration of the user also, along with logged out users. The requirement is to show how long the users have logged in the VPN the whole day, and whoever is active, I have to show the duration of the user as one of the field in my dashboard. I know that we have to use earliest(_time) and latest(_time) by user, status (login or logout), but somehow I m facing some issues in getting proper duration. Please help ?

0 Karma

pgadhari
Builder

I am asking for timechart instead of stats. ?

0 Karma

pgadhari
Builder

Sure. I will check this and revert.

0 Karma
Get Updates on the Splunk Community!

Video | Welcome Back to Smartness, Pedro

Remember Splunk Community member, Pedro Borges? If you tuned into Episode 2 of our Smartness interview series, ...

Detector Best Practices: Static Thresholds

Introduction In observability monitoring, static thresholds are used to monitor fixed, known values within ...

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...