I am trying to find the distance between two or more IP geolocations without the use of an external script (I'm not an admin). Here is my base search:
tag=login | geoip src_ip | stats distinct_count(src_ip_country_name) AS count_country, values(src_ip_country_name) AS country by username | where count_country > 1
I know I can find the difference in the latitude and longitude fields. Something like the following:
But how do I incorporate that into my base search? Would I be able to build a table with the geolocations and the distance grouped by username?
The Pythagorean theorem is a good approximation only for shorter distances. If you're actually dealing with pretty big distances, you have to break out some trig functions and calculate the great-circle distance. http://en.wikipedia.org/wiki/Great-circle_distance
And since eval can't do trig functions (see http://splunk-base.splunk.com/answers/26399/can-eval-evaluate-cosines), that would lead you back to a custom search command again.
However, if your distances are all short enough, then what you propose just needs to be plugged into eval.
| eval distance=sqrt(pow(src_ip_latitude1-src_ip_latitude2,2)+pow(src_ip_longitude1-src_ip_longitude2,2))
Once that eval clause gives you that field called distance on your rows, you can do whatever you want with it.
I completely forgot about the fact that the Earth is round. 🙂 Too bad I can't use the great-circle formula.
How can I pull out the latitude and longitude field by username and plug it into the eval? In other words, how can I incorporate the eval into the base search?
Assuming you have those other four fields in your events, just tack the | eval onto the end of the search. By itself, that eval will add an additional field called "distance" to all rows. Again, you have to have all four of those fields, by those exact case-sensitive names, on all events. More generally, they have to be on all incoming rows, whether they're events or whether they've already been transformed or altered by other search language commands.
I think my question is a little more complex than I initially thought. My current base search only has the src_ip_latitude and src_ip_longitude fields. I want to break them up (e.g. latitude1, latitude2, etc.) grouped by username. I'm thinking I would need to alter the end of my search to something like "where (count_country > 1) AND (distance > 100)". That means I likely need to do the distance calculation within my stats clause, because after my stats clause I no longer have access to the latitude and longitude fields.
No, I don't see why you'd need to do the distance calculation within the stats clause. That would be a little crazy. Do it before and use some form of "last(distance) as distance by username", or "by username distance", in your stats, and then filter afterwards. Or use some form of "last(src_ip_latitude) as src_ip_latitude last(src_ip_longitude) as src_ip_longitude" in stats and then do the distance calculation after.
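A minimal sketch of the second option, assuming the geoip command adds src_ip_latitude and src_ip_longitude alongside src_ip_country_name (the names used earlier in this thread). The lat1/lon1/lat2/lon2 names are just illustrative, and first/last only compares two of a user's locations:

```spl
tag=login
| geoip src_ip
| stats distinct_count(src_ip_country_name) AS count_country,
        values(src_ip_country_name) AS country,
        first(src_ip_latitude) AS lat1, first(src_ip_longitude) AS lon1,
        last(src_ip_latitude) AS lat2, last(src_ip_longitude) AS lon2
        by username
| where count_country > 1
| eval distance=sqrt(pow(lat1-lat2,2)+pow(lon1-lon2,2))
| sort - distance
```

Keep in mind the result is in degrees, not miles or km (one degree of latitude is roughly 111 km), so any threshold on distance has to be picked with that in mind.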
I'm working on a similar query and I much appreciate what you've both done here. I've worked up this:
| lookup geoip clientip | dedup userID, client_city | eval location=clientip."- ".client_city.", ".client_region.", ".client_country | stats last(client_lat) as Lat1, last(client_lon) as Lon1, first(client_lat) as Lat2, first(client_lon) as Lon2, values(location) dc(client_city) as distinctCount by userID | where distinctCount = 2 | eval distance=sqrt(pow(Lat1-Lat2,2)+pow(Lon1-Lon2,2)) | sort distance desc
I've gotten it to work when a user has had 2 different IPs. Using first & last precludes more than that, though. Still trying to work on that.
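One possible way around the first/last limitation, sketched under the assumption that the lookup returns client_lat, client_lon and client_city as in the search above: use streamstats to carry each user's previous location forward and compare every login to the one before it. prev_lat, prev_lon and max_distance are made-up names.

```spl
| lookup geoip clientip
| sort 0 _time
| streamstats current=f last(client_lat) AS prev_lat, last(client_lon) AS prev_lon by userID
| where isnotnull(prev_lat)
| eval distance=sqrt(pow(client_lat-prev_lat,2)+pow(client_lon-prev_lon,2))
| stats max(distance) AS max_distance, dc(client_city) AS distinctCount by userID
| where distinctCount > 1
| sort - max_distance
```

This keeps only the largest jump per user. Swapping max for values(distance) would list every jump instead, and the distance is still a flat approximation in degrees.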
Fast forward into the future: we can do the great-circle formula in Splunk now.
This example will provide the expected result:
| makeresults | eval lat1=1, lon1=1, lat2=2, lon2=2 | eval rlat1 = pi()*lat1/180, rlat2=pi()*lat2/180, rlat = pi()*(lat2-lat1)/180, rlon= pi()*(lon2-lon1)/180 | eval a = sin(rlat/2) * sin(rlat/2) + cos(rlat1) * cos(rlat2) * sin(rlon/2) * sin(rlon/2) | eval c = 2 * atan2(sqrt(a), sqrt(1-a)) | eval distance = 6371 * c | table lat1 lon1 lat2 lon2 distance
distance will be the distance in kilometers, since 6371 is the Earth's mean radius in km.
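To tie this back to the original question, here's a rough sketch that plugs the same formula into the tag=login search. It assumes geoip produces src_ip_latitude and src_ip_longitude; lat1/lon1/lat2/lon2 are illustrative names, the 100 km threshold is borrowed from the earlier post purely as a placeholder, and first/last still only compares two locations per user.

```spl
tag=login
| geoip src_ip
| stats distinct_count(src_ip_country_name) AS count_country,
        values(src_ip_country_name) AS country,
        first(src_ip_latitude) AS lat1, first(src_ip_longitude) AS lon1,
        last(src_ip_latitude) AS lat2, last(src_ip_longitude) AS lon2
        by username
| eval rlat1=pi()*lat1/180, rlat2=pi()*lat2/180, rlat=pi()*(lat2-lat1)/180, rlon=pi()*(lon2-lon1)/180
| eval a=sin(rlat/2)*sin(rlat/2) + cos(rlat1)*cos(rlat2)*sin(rlon/2)*sin(rlon/2)
| eval c=2*atan2(sqrt(a), sqrt(1-a))
| eval distance=6371*c
| where count_country > 1 AND distance > 100
| table username country lat1 lon1 lat2 lon2 distance
```

Since 6371 is a radius in km, the filter here is in km too. Use 3959 instead if you'd rather have miles.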
Hope this helps ...