Dashboards & Visualizations

Choropleth - getting the data correct

chadwell
Explorer

In my web application, I want to log user country when they first log in. I then want to create a Choropleth using the geom country built in lookup.

I understand the examples in how to use the geom tool, but I am not clear on what I need to log into splunk, what to search for and how to get that data correctly into geom.

So for example, if I decide to log this every time a user logs in:

Log.Info ("User logged in from:" + user.country);

e.g.s "User logged in from: Canada"
"User logged in from: France"

I then want to map the count of each country and map it on a Choropleth. Could someone help me out here?

Thanks

Labels (3)
0 Karma
1 Solution

rmmiller
Contributor

You have 2 options, but your approach is significantly more work unless you don't have the real client IP address in your logs (perhaps due to a reverse proxy). I'll describe both.

To do this with your approach:

Before doing attempting to do this with Splunk, make sure the country names you write in your log match up to the names here:

| inputlookup geo_countries
| table featureId

This gives you the names of the countries as Splunk knows them. Getting this part right is critical to your statistics being correct. If you use a country name that doesn't match up to the name Splunk knows, that country's count won't show up in your choropleth map.

Once you are logging the country name correctly, you'll need to extract your logged country name into a field. In this example I used the field name featureId since it helps illustrate an argument to the geom command.

| rex field=_raw "User\slogged\sin\sfrom\:\s(?<featureId>[^\"]+)"

Next, use stats to calculate your counts per country (featureId):

| stats count BY featureId

This will tally up the events in your log per country name, which is stored in the featureId field.

Last, call geom to add the JSON data to make your choropleth map come to life:

| geom geo_countries

In my example, I put the country name in a field named featureId since that's the field name in the lookup file that geom reads. If you stored your logged country name in a different field, you'll need to use the featureIdField argument to specify that name like |geom geo_countries featureIdField=myCountryFieldName and it will work just fine.

Put it all together and it looks like this:

<your search to get your login info>
| rex field=_raw "User\slogged\sin\sfrom\:\s(?<featureId>[^\"]+)"
| stats count BY featureId
| geom geo_countries

Change your visualization to choropleth map, and your work is done.

Again, that's the hard way unless your web logs don't have the real client IP address. However, if you are already logging the real client IP address, then let Splunk do all the work for you. Don't bother creating a custom log containing the login country and use the logged IP address instead. In this example/approach, I'm assuming you can get your client IP address into a field named clientipFieldName. Obviously, change if it's in a different field name.

The easy way:

<your search to get your logged client IP in field named clientipFieldName>
| iplocation clientipFieldName
| stats count BY Country
| geom geo_countries featureIdField=Country

And finally, here's some run-anywhere code to play around with to help understand how to get choropleth data into the correct format:

| makeresults count=500
| eval countries=split("France,France,Canada,Canada,Canada,United States,Honduras,Turkey,Ukraine",",")
| eval Country=mvindex(countries,random()%9)
| rename COMMENT AS "The 3 lines above are just simulating your data"
| stats count BY Country
| geom geo_countries featureIdField=Country

Hope that helps!
rmmiller

View solution in original post

DalJeanis
Legend

You have three completely different things in your question.

First, how do you get the information from your web application to Splunk.
Second, how do you get the information from Splunk of how many people logged on from where.
Third, how do you present that on a Choropleth.

What is it that you really want to know?

0 Karma

rmmiller
Contributor

You have 2 options, but your approach is significantly more work unless you don't have the real client IP address in your logs (perhaps due to a reverse proxy). I'll describe both.

To do this with your approach:

Before doing attempting to do this with Splunk, make sure the country names you write in your log match up to the names here:

| inputlookup geo_countries
| table featureId

This gives you the names of the countries as Splunk knows them. Getting this part right is critical to your statistics being correct. If you use a country name that doesn't match up to the name Splunk knows, that country's count won't show up in your choropleth map.

Once you are logging the country name correctly, you'll need to extract your logged country name into a field. In this example I used the field name featureId since it helps illustrate an argument to the geom command.

| rex field=_raw "User\slogged\sin\sfrom\:\s(?<featureId>[^\"]+)"

Next, use stats to calculate your counts per country (featureId):

| stats count BY featureId

This will tally up the events in your log per country name, which is stored in the featureId field.

Last, call geom to add the JSON data to make your choropleth map come to life:

| geom geo_countries

In my example, I put the country name in a field named featureId since that's the field name in the lookup file that geom reads. If you stored your logged country name in a different field, you'll need to use the featureIdField argument to specify that name like |geom geo_countries featureIdField=myCountryFieldName and it will work just fine.

Put it all together and it looks like this:

<your search to get your login info>
| rex field=_raw "User\slogged\sin\sfrom\:\s(?<featureId>[^\"]+)"
| stats count BY featureId
| geom geo_countries

Change your visualization to choropleth map, and your work is done.

Again, that's the hard way unless your web logs don't have the real client IP address. However, if you are already logging the real client IP address, then let Splunk do all the work for you. Don't bother creating a custom log containing the login country and use the logged IP address instead. In this example/approach, I'm assuming you can get your client IP address into a field named clientipFieldName. Obviously, change if it's in a different field name.

The easy way:

<your search to get your logged client IP in field named clientipFieldName>
| iplocation clientipFieldName
| stats count BY Country
| geom geo_countries featureIdField=Country

And finally, here's some run-anywhere code to play around with to help understand how to get choropleth data into the correct format:

| makeresults count=500
| eval countries=split("France,France,Canada,Canada,Canada,United States,Honduras,Turkey,Ukraine",",")
| eval Country=mvindex(countries,random()%9)
| rename COMMENT AS "The 3 lines above are just simulating your data"
| stats count BY Country
| geom geo_countries featureIdField=Country

Hope that helps!
rmmiller

chadwell
Explorer

We don't have real ip addresses as the users are behind corporate VPNs so they all look like they are coming from the same place.

Thanks for this, it looks exactly like what I need. I will try it out and report back.

0 Karma

rmmiller
Contributor

@chadwell Any luck?

0 Karma

chadwell
Explorer

worked great!

0 Karma

rmmiller
Contributor

Sweet! Good luck on your Splunk travels!

0 Karma
Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...