Find the Distance Between Two or More Geolocation Coordinates

martinaire
Explorer

I am trying to find the distance between two or more IP geolocations without using an external script (I am not an admin). Here is my base search:

tag=login | geoip src_ip | stats distinct_count(src_ip_country_name) AS count_country, values(src_ip_country_name) AS country by username | where count_country > 1

I know I can find the difference in the latitude and longitude fields. Something like the following:

sqrt(pow(src_ip_latitude1-src_ip_latitude2,2)+pow(src_ip_longitude1-src_ip_longitude2,2))

But how do I incorporate that into my base search? Would I be able to build a table with the geolocations and the distance grouped by username?

Thanks!

MuS
Legend

Hi there,

Fast forward into the future: we can now do the great-circle formula in Splunk.
This example will provide the expected result:

| makeresults 
| eval lat1=1, lon1=1, lat2=2, lon2=2 
| eval rlat1 = pi()*lat1/180, rlat2=pi()*lat2/180, rlat = pi()*(lat2-lat1)/180, rlon= pi()*(lon2-lon1)/180
| eval a = sin(rlat/2) * sin(rlat/2) + cos(rlat1) * cos(rlat2) * sin(rlon/2) * sin(rlon/2) 
| eval c = 2 * atan2(sqrt(a), sqrt(1-a)) 
| eval distance = 6371 * c
| table lat1 lon1 lat2 lon2 distance

distance will be the great-circle distance in km (6371 is the Earth's mean radius in km).
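
For reference, the eval chain above appears to follow the standard haversine formulation. The notation below is an assumption added for illustration, not part of the original post: φ and λ are latitude and longitude in radians, and R is the radius of the sphere.

```latex
\begin{aligned}
\Delta\varphi &= \varphi_2 - \varphi_1, \qquad \Delta\lambda = \lambda_2 - \lambda_1 \\
a &= \sin^2\!\left(\tfrac{\Delta\varphi}{2}\right) + \cos\varphi_1 \cos\varphi_2 \, \sin^2\!\left(\tfrac{\Delta\lambda}{2}\right) \\
c &= 2 \operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1-a}\right) \\
d &= R\,c, \qquad R = 6371\ \text{km}
\end{aligned}
```

In the search, rlat1 and rlat2 play the role of φ1 and φ2, rlat is Δφ, rlon is Δλ, and the fields a, c and distance match a, c and d.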

Hope this helps ...

cheers, MuS

malvidin
Communicator

The three macros below calculate the haversine formula that @MuS provided. 

[haversine(5)]
# Calculate the great circle distance for a sphere with an arbitrary radius
args = input_lat1, input_lon1, input_lat2, input_lon2, hav_radius
definition = "eval hav_lat1_radians = pi()*$input_lat1$/180, hav_lat2_radians=pi()*$input_lat2$/180, hav_delta_lat_radians = pi()* ($input_lat2$-$input_lat1$)/180, hav_delta_lon_radians= pi()*($input_lon2$-$input_lon1$)/180 | eval hav_intermediate = pow(sin(hav_delta_lat_radians/2), 2) + cos(hav_lat1_radians) * cos(hav_lat2_radians) * pow(sin(hav_delta_lon_radians/2), 2) | eval hav_distance = 2 * $hav_radius$ * atan2(sqrt(hav_intermediate), sqrt(1-hav_intermediate)) | fields - hav_*_radians, hav_intermediate "

[haversine(4)]
# Calculate the great circle distance for the earth (in kilometers)
args = input_lat1, input_lon1, input_lat2, input_lon2
definition = "`haversine($input_lat1$, $input_lon1$, $input_lat2$, $input_lon2$, 6371)` "

[haversine(2)]
# Calculate the great circle distance between two IPs (in kilometers)
args = input_ip1, input_ip2
definition = "iplocation $input_ip1$ prefix=$input_ip1$_ | iplocation $input_ip2$ prefix=$input_ip2$_ | `haversine($input_ip1$_lat, $input_ip1$_lon, $input_ip2$_lat, $input_ip2$_lon)` "

Using streamstats, you can calculate IP location distances between events. With eventstats, you can calculate IP location distances between a common IP location and an event's IP location.

The calculated value is returned as hav_distance, to decrease the chances of a field name collision.

The haversine formula is not as accurate as Vincenty's formulae, but is much more accurate than a simple chord length calculation. 

| makeresults 
| eval usual_src_ip="8.8.8.8", src_ip="9.9.9.9"
| `haversine(usual_src_ip, src_ip)`
| where hav_distance > 500 
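
A rough sketch of the streamstats case, assuming the three macros above are installed and that the events carry username and src_ip fields (both borrowed from the original question). The field name prev_src_ip is made up for this example, and the 500 km threshold is simply reused from the search above.

```spl
tag=login
| sort 0 _time
| streamstats current=f last(src_ip) AS prev_src_ip by username
| where isnotnull(prev_src_ip) AND prev_src_ip!=src_ip
| `haversine(prev_src_ip, src_ip)`
| where hav_distance > 500
| table _time username prev_src_ip src_ip hav_distance
```

The sort puts events in chronological order so that, with current=f, streamstats hands each login the source IP of the same user's previous login. For the eventstats case, swapping the streamstats line for something like `| eventstats mode(src_ip) AS usual_src_ip by username` should compare each login against that user's most frequent source IP instead.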

_smp_
Builder
Just wanted to post a quick thanks for these macros. They came in handy to replace a custom command |distance that was included with the Okta app which did not pass the Cloud app vetting process. Thanks for posting!

Damien_Dallimor
Ultra Champion

There is a Haversine add-on on Splunkbase that should do the trick for you.

rgonzale6
Path Finder

I'm working on a similar query and I much appreciate what you've both done here. I've worked up this:

| lookup geoip clientip
| dedup userID, client_city
| eval location=clientip."- ".client_city.", ".client_region.", ".client_country
| stats last(client_lat) as Lat1, last(client_lon) as Lon1, first(client_lat) as Lat2, first(client_lon) as Lon2, values(location) dc(client_city) as distinctCount by userID
| where distinctCount = 2
| eval distance=sqrt(pow(Lat1-Lat2,2)+pow(Lon1-Lon2,2))
| sort distance desc

I've gotten it to work when a user has had two different IPs. Using first() and last() precludes handling more than two, though. I'm still working on that.

sideview
SplunkTrust

The Pythagorean theorem is a good approximation only for shorter distances. If you're actually dealing with pretty big distances, you have to break out some trig functions and calculate the great-circle distance: http://en.wikipedia.org/wiki/Great-circle_distance

And since eval can't do trig functions ( see http://splunk-base.splunk.com/answers/26399/can-eval-evaluate-cosines ) that would lead you back to a custom search command again.

However, if your distances are all short enough, then what you propose just needs to be plugged into eval.

| eval distance=sqrt(pow(src_ip_latitude1-src_ip_latitude2,2)+pow(src_ip_longitude1-src_ip_longitude2,2))

Once that eval clause gives you that field called distance on your rows, you can do whatever you want with it.
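
One caveat worth spelling out: that expression yields a value in degrees rather than kilometers, and it treats a degree of longitude as the same length at every latitude. A common short-distance fix is the equirectangular approximation, sketched below as an assumption added for illustration rather than something proposed in this thread. Here φ and λ are latitude and longitude in degrees, and R = 6371 km is taken as the Earth's mean radius.

```latex
\begin{aligned}
x &= (\lambda_2 - \lambda_1)\,\cos\!\left(\frac{\pi}{180}\cdot\frac{\varphi_1 + \varphi_2}{2}\right) \\
y &= \varphi_2 - \varphi_1 \\
d &\approx \frac{\pi R}{180}\,\sqrt{x^2 + y^2} \;\approx\; 111.19\ \text{km} \times \sqrt{x^2 + y^2}
\end{aligned}
```

For example, at 60° latitude one degree of longitude covers roughly 111.19 × cos 60° ≈ 55.6 km, about half of what it covers at the equator, which the plain degree-based formula cannot see.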

sideview
SplunkTrust

No, I don't see why you'd need to do the distance calculation within the stats clause. That would be a little crazy. Do it before and use some form of last(distance) as distance by username, or by username distance in your stats, and then filter afterwards. Or use some form of last(src_ip_latitude) as src_ip_latitude last(src_ip_longitude) as src_ip_longitude in stats and then do the distance calculation after.
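
A minimal sketch of the "calculate first, then stats, then filter" ordering, assuming the geoip command from the original question adds src_ip_latitude, src_ip_longitude and src_ip_country_name to every event, and reusing the great-circle eval posted by MuS in this thread. The names prev_lat, prev_lon and max_distance are made up for this example, and the 100 threshold comes from the follow-up question.

```spl
tag=login
| geoip src_ip
| sort 0 _time
| streamstats current=f last(src_ip_latitude) AS prev_lat, last(src_ip_longitude) AS prev_lon by username
| eval rlat1 = pi()*prev_lat/180, rlat2 = pi()*src_ip_latitude/180, rlat = pi()*(src_ip_latitude-prev_lat)/180, rlon = pi()*(src_ip_longitude-prev_lon)/180
| eval a = sin(rlat/2) * sin(rlat/2) + cos(rlat1) * cos(rlat2) * sin(rlon/2) * sin(rlon/2)
| eval c = 2 * atan2(sqrt(a), sqrt(1-a))
| eval distance = 6371 * c
| stats max(distance) AS max_distance, dc(src_ip_country_name) AS count_country, values(src_ip_country_name) AS country by username
| where count_country > 1 AND max_distance > 100
```

Each login is compared with the same user's previous login, so this also covers users with more than two locations. The first login per user has no predecessor, so its distance is null and max() simply skips it, while the event still counts toward count_country.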

martinaire
Explorer

I think my question is a little more complex than I initially thought. My current base search only has the src_ip_latitude and src_ip_longitude fields. I want to break them up (e.g. latitude1, latitude2, etc.), grouped by username. I'm thinking I would need to alter the end of my search to something like "where (count_country > 1) AND (distance > 100)". That means I likely need to do the distance calculation within my stats clause, because after the stats clause I no longer have access to the latitude and longitude fields.

sideview
SplunkTrust

Assuming you have those other four fields in your events, just tack the | eval onto the end of the search. That eval by itself will add a field called "distance" to every row. Again, all four fields must be present, under those exact case-sensitive names, on all events, or more generally on all incoming rows, whether they are raw events or have already been transformed by other search commands.

martinaire
Explorer

I completely forgot about the fact that the Earth is round. 🙂 Too bad I can't use the great-circle formula.

How can I pull out the latitude and longitude field by username and plug it into the eval? In other words, how can I incorporate the eval into the base search?
