I was reading https://answers.splunk.com/answers/560188/logic-behind-geographically-improbable-access-dete.html and I am trying to come up with a simpler query to find geographically improbable access.
My environment is not as rich as the one in that thread. At the moment I have three fields:
subject=ID of the user
IP=IP address from where they have logged on
from IP I can obtain the fields "lat" and "lon"
Then, with some simple string manipulation, I arrive at the following:
index=main eventtype="loginevents" subject=* | fields ip subject _time | iplocation ip | eval lat=tostring(lat), lon=tostring(lon) | eval latlon=lat.", ".lon | stats count by ip latlon
My issue is that this only gives me basic statistics. What I want is to compare a user's last two logins and see how far apart the two locations are, which would mean adding the previous login's lat and lon as separate fields. Any idea how to do this?
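You can use the query below, which is based on the Haversine formula. Replace [BASE SEARCH] with your own search; user_id and clientip correspond to your subject and ip fields.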
[BASE SEARCH] | dedup user_id, clientip | eval time1=_time | map maxsearches=99 search="search [BASE SEARCH] | eval clientip1=\"$clientip$\", time1=$time1$, time2=_time | search user_id=\"$user_id$\" clientip!=clientip1 | dedup user_id, clientip | rename clientip as clientip2" | where clientip1!=clientip2 | iplocation clientip1 | eval lat1=lat, lon1=lon, city1=City, country1=Country | iplocation clientip2 | eval lat2=lat, lon2=lon, city2=City, country2=Country | eval rlat1 = pi()*lat1/180, rlat2 = pi()*lat2/180, rlat = pi()*(lat2-lat1)/180, rlon = pi()*(lon2-lon1)/180 | eval a = sin(rlat/2) * sin(rlat/2) + cos(rlat1) * cos(rlat2) * sin(rlon/2) * sin(rlon/2) | eval c = 2 * atan2(sqrt(a), sqrt(1-a)) | eval distance = 6371 * c | eval timestamp1=strftime(time1, "%Y-%m-%d %H:%M:%S"), timestamp2=strftime(time2, "%Y-%m-%d %H:%M:%S") | table user_id, timestamp1, clientip1, city1, country1, timestamp2, clientip2, city2, country2, distance | rename distance as "distance in KM"
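If you want to sanity-check the distance part outside Splunk, here is a minimal Python sketch of the same Haversine arithmetic, plus a helper that pairs each login with the previous one for a single user. The function names haversine_km and consecutive_distances and the (epoch_seconds, lat, lon) tuple layout are my own assumptions for illustration; they are not part of the query above or of any Splunk API.

```python
import math

EARTH_RADIUS_KM = 6371  # same mean Earth radius the query multiplies by


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    # Variable names mirror the eval steps in the query (rlat1, rlat2, rlat, rlon, a, c).
    rlat1 = math.radians(lat1)
    rlat2 = math.radians(lat2)
    rlat = math.radians(lat2 - lat1)
    rlon = math.radians(lon2 - lon1)
    a = (math.sin(rlat / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin(rlon / 2) ** 2)
    a = min(1.0, max(0.0, a))  # guard against floating-point rounding just outside [0, 1]
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


def consecutive_distances(logins):
    """Pair each login with the previous one for a single user.

    logins: iterable of (epoch_seconds, lat, lon) tuples (hypothetical layout).
    Returns a list of (previous_time, current_time, distance_km).
    """
    ordered = sorted(logins)  # oldest first, by epoch_seconds
    return [
        (prev[0], cur[0], haversine_km(prev[1], prev[2], cur[1], cur[2]))
        for prev, cur in zip(ordered, ordered[1:])
    ]
```

Dividing distance_km by the time difference between the two logins gives an implied travel speed, which is usually easier to put a threshold on than raw distance.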