Some sample data for creating a maps visualisation in Splunk
countries_lat_long_int_code.csv
code,name,country,latitude,longitude
61,Australia,AU,-25.274398,133.775136
86,China,CN,35.86166,104.195397
49,Germany,DE,51.165691,10.451526
33,France,FR,46.227638,2.213749
64,New Zealand,NZ,-40.900557,174.885971
685,Samoa,WS,-13.759029,-172.104629
41,Switzerland,CH,46.818188,8.227512
1,United States,US,37.09024,-95.712891
678,Vanuatu,VU,-15.376706,166.959158
If I add this to Lookups » Lookup table files in Splunk, I can generate a map visualisation.
Then if I put something like this in the search bar, it will generate a map visualisation:
| inputlookup countries_lat_long_int_code.csv | fields + latitude longitude | eval field1=100
The Statistics tab will look like this:
latitude longitude field1
-25.274398 133.775136 100
35.86166 104.195397 100
51.165691 10.451526 100
46.227638 2.213749 100
-40.900557 174.885971 100
-13.759029 -172.104629 100
46.818188 8.227512 100
37.09024 -95.712891 100
-15.376706 166.959158 100
What I would like to know is what parameters/format the data has to be in for a maps visualisation?
For example, it looks like latitude and longitude must be the first 2 columns, and possibly in that particular order.
Can anyone explain what other formats are accepted, or point me in the right direction? For example I am just playing around with something like this:
| inputlookup countries_lat_long_int_code.csv | fields + latitude longitude | eval field1=100 | eval field2=200 | eval field3="country name"
The data has to have the format you already have, i.e. degrees latitude and longitude - that's it. Where they come from and what else you do with them is entirely up to you.
Have you had a look at the geostats command? It needs at least one statistics function to calculate numbers for the "geo-bins" (you can have a simple count of events per binned location, or an average of a field y, or anything you can think of). Of course it also needs latitude and longitude information. If these fields exist with the names lat and lon, then you won't have to specify them explicitly; otherwise you tell the command where to look for those two values with latfield= and longfield=.
To use your inputlookup, you can do something like this:
| inputlookup countries_lat_long_int_code.csv | geostats latfield=latitude longfield=longitude count
If you named the columns in your csv "lat" and "lon", the search could simply be:
| inputlookup countries_lat_long_int_code.csv | geostats count
I hope this answers your question. You don't need any order in your data, it's all in the fields. And heck, if it isn't, you can simply eval it on the fly 🙂
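If you would rather rename the columns in the lookup file itself than pass latfield and longfield every time, a small script can rewrite the header before you upload it. This is only a rough sketch in Python: the RENAMES mapping and the rename_columns helper are made up for illustration, and the two data rows are copied from the sample file in the question.

```python
import csv
import io

# Two rows copied from the sample lookup in the question;
# a real script would read the csv file from disk instead.
SAMPLE = """code,name,country,latitude,longitude
61,Australia,AU,-25.274398,133.775136
64,New Zealand,NZ,-40.900557,174.885971
"""

# Hypothetical mapping: only the two coordinate columns are renamed.
RENAMES = {"latitude": "lat", "longitude": "lon"}


def rename_columns(csv_text, renames):
    """Return csv_text with header names replaced according to renames."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Only the header row changes; data rows are written back untouched.
    rows[0] = [renames.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


renamed = rename_columns(SAMPLE, RENAMES)
print(renamed)
```

With the header changed to lat and lon, the shorter search above should find the coordinate fields without any extra arguments.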
tks, food for thought for me re the geostats. will look more into that.
I tried this:
| inputlookup countries_lat_long_int_code.csv | geostats latfield=latitude longfield=longitude count
and this creates a geo_bin column, but I am not sure what this is. Is it some other kind of coordinate scheme?
Yes, geostats will group your events into buckets/bins (based on distance to each other on the map in relation to the current zoom level and the settings on the maximum number of bins), much like bucket _time does based on time. It will do this for each zoom level and name the buckets with their x and y coordinates, which is why you see data like e.g. "zl_0" for zoom level 0 and "y_144_x_190" for the bucket containing all events from that area on that zoom level.
The statistics view of geostats is not that impressive though, head to the visualization tab to see the magic 🙂
tks, just to clarify, does it do some kind of clustering, i.e. looking at the UK and Ireland you would only see one big marker, and then as you zoom in this one marker would be broken up to show 2 markers, one for Ireland and one for the UK? Do I understand that correctly?
Partially, yes. It creates these buckets based on distance, not based on country.
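To make the "distance, not country" point a bit more concrete, here is a rough Python sketch of grid binning. It is not how geostats is actually implemented: the cell size of 360 / 2**zoom degrees and the grid_bin and count_per_bin helpers are assumptions made up for illustration. The three points are rows from the sample lookup in the question.

```python
import math
from collections import Counter

# Three rows from the sample lookup in the question: (name, latitude, longitude).
POINTS = [
    ("Germany", 51.165691, 10.451526),
    ("France", 46.227638, 2.213749),
    ("Switzerland", 46.818188, 8.227512),
]


def grid_bin(lat, lon, zoom):
    """Map a coordinate to a (zoom, y, x) grid cell.

    Assumption for illustration only: the cell edge is 360 / 2**zoom
    degrees, so each extra zoom level halves the cell size.
    """
    cell = 360.0 / (2 ** zoom)
    y = math.floor((lat + 90.0) / cell)
    x = math.floor((lon + 180.0) / cell)
    return (zoom, y, x)


def count_per_bin(points, zoom):
    """Count how many points land in each grid cell at the given zoom level."""
    return Counter(grid_bin(lat, lon, zoom) for _, lat, lon in points)


for zoom in (0, 6):
    print(zoom, dict(count_per_bin(POINTS, zoom)))
```

In this toy grid the three countries share a single cell at zoom level 0 and end up in three separate cells at zoom level 6. Which points get merged depends only on where the cell boundaries fall, so two places in different countries can share a marker while two places in the same country can be split.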