Splunk Search

How to find "non" unique users.

jjjefferson
Engager

<--- NOOB

Ok...so here is my quandry... I have a query (see below) that returns a list of users, ips and client info from the geoip app. I want to take these results and only look at user with more than one ip associated with them. I used dedup to get rid of duplicates so this is a pretty boiled down set of results. However, there are still thousands of them. How do I get rid of "non-unique" users keeping only the users that appear more than once.

query:

index=<some_index>  c_ip=* User_Agent!="*<string>*" cs_username!="-" rs_Content_Type = "text/html*" | `exclude_internal_ip` | dedup username ip | lookup geoip clientip as ip | fillnull value=NULL | table username,ip,client_country,client_region,date,time,client_city,client_lat,client_lon | sort by  str(username),num(ip),date,time.str(client_region)

output is:

user1     xxx.xx.xx.xx  US  etc...
user2     xxx.xx.xx.yy  US  etc...
user2     xxx.xx.xx.xy  US  etc...
user3     xxx.xx.xx.xz  US  etc...
user4     xxx.xx.xx.zz  US  etc...
user4     xxx.xx.xx.zx  US  etc...

I want to pair down the list to:

user2     xxx.xx.xx.yy  US  etc...
user2     xxx.xx.xx.xy  US  etc...
user4     xxx.xx.xx.zz  US  etc...
user4     xxx.xx.xx.zx  US  etc...
Tags (1)
1 Solution

Ayn
Legend

Use eventstats to find users with multiple logins.

index=<some_index>  c_ip=* User_Agent!="*<string>*" cs_username!="-" rs_Content_Type = "text/html*" | `exclude_internal_ip` | eventstats dc(ip) as distinct_ips by username| search distinct_ips>1 | lookup geoip clientip as ip | fillnull value=NULL | table username,ip,client_country,client_region,date,time,client_city,client_lat,client_lon | sort by  str(username),num(ip),date,time.str(client_region)

View solution in original post

jjjefferson
Engager

awesome! I just moved the eventstats to the end :

index=some_log c_ip=* cs_User_Agent!="" cs_username!="-" rs_Content_Type = "text/html*" | exclude_internal_ip | dedup cs_username ip | lookup geoip clientip as ip | fillnull value=NULL | table cs_username,ip,client_country,client_region,date,time,client_city,client_lat,client_lon | sort by str(cs_username),num(ip),date,time.str(client_region) | eventstats dc(ip) as distinct_ips by cs_username| search distinct_ips>1 |

this lists only the users with multiple ips!

Thank You!!!!!

0 Karma

MHibbin
Influencer

I would use eventstats in the search string.

With eventstats you can the dc(x) function which will provide a distinct (unique) count of x (which could be your IP addresses). I have not tested this, so may need a little tweaking, but perhaps this:

index=<some_index>  c_ip= User_Agent!="<string>" cs_username!="-" rs_Content_Type = "text/html" 
| exclude_internal_ip 
| lookup geoip clientip as ip 
| fillnull value=NULL 
| eventstats dc(ip) AS dcIP by username
| dedup username, ip
| where dcSrc>1
| table username,ip,client_country,client_region,date,time,client_city,client_lat,client_lon
| sort by str(username),num(ip),date,time.str(client_region)

Hope this helps.

Ref:
http://docs.splunk.com/Documentation/Splunk/4.2.3/SearchReference/Eventstats
http://docs.splunk.com/Documentation/Splunk/5.0.2/SearchReference/CommonStatsFunctions
http://docs.splunk.com/Documentation/Splunk/5.0.2/SearchReference/Where

Ayn
Legend

Use eventstats to find users with multiple logins.

index=<some_index>  c_ip=* User_Agent!="*<string>*" cs_username!="-" rs_Content_Type = "text/html*" | `exclude_internal_ip` | eventstats dc(ip) as distinct_ips by username| search distinct_ips>1 | lookup geoip clientip as ip | fillnull value=NULL | table username,ip,client_country,client_region,date,time,client_city,client_lat,client_lon | sort by  str(username),num(ip),date,time.str(client_region)

MHibbin
Influencer

Ah beat me to it!

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...