Splunk Search

Extracting countries from sourcetype without Longitude/Latitude ?

wifemin
Engager

Hi! I am new to Splunk and just started recently.
I have some RSS feeds coming into Splunk through "Syndication", and I was wondering how I can extract countries from the feeds, as there is no longitude/latitude.

Edit //
There are no IP addresses either. The data is sourcetype=syndication; I guess that is needed in order to extract the country from the feeds and show it.

Here's an example of a raw feed.

summary="Russia is engaged in wide-ranging information warfare operations aimed at undermining the United States, and the federal government has few defenses against the attacks,"

In this case, I would like to extract the country "Russia" and add a count to it and show it on a map.

Sorry for being vague. I can give more information if needed; I just don't know where I am being vague.

0 Karma
1 Solution

Richfez
SplunkTrust

There are several steps to this.

The first thing you need to do is make a rex or regex to pull out country names from the events in your data. Well, at least that's what I did. I built the "string" of country names for the rex/regex by running this search:

| inputlookup geo_attr_countries.csv 
| fields country  

Then I moved the output to Notepad++ and put a pipe symbol in place of each \r\n. That gave me a list like this:

Aruba|Afghanistan|Angola|Anguilla|Albania|American Samoa|Antarctica| ... 
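
If you would rather not leave Splunk for the Notepad++ step, the same pipe-delimited string can probably be built right in the search bar. This is only a sketch: it assumes the lookup has a country field as in the search above, and pattern is just a field name picked for the example. Country names that contain regex metacharacters (parentheses or periods, say) would still need escaping by hand before they go into a rex.

```spl
| inputlookup geo_attr_countries.csv
| stats values(country) as country
| eval pattern=mvjoin(country, "|")
| fields pattern
```

stats values() collects the distinct country names into one multivalue field, and mvjoin() glues them together with the pipe symbol, so the single result row can be copied straight into the rex.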

Now that I have that list, I can do something silly like create a big rex and feed into it some test data. (I'm just going to test with a subset of the data for now). I heartily recommend making this a) an automatic lookup but b) only on the data you need. Anyway, the run-anywhere example is below:

| makeresults 
| eval MyTest="And today, Ireland declared they no longer believed in trees." 
|  rex field=MyTest "(?<CountryName>(Ireland|Iran|Iraq|Iceland|Portugal|Paraguay|Palestine|Romania|Russia))"

This gives me results for CountryName just like I'd expect. If you change 'Ireland' to 'Paraguay' or 'Russia' it returns that. (And yes, I actually trimmed out some of the inside country names from that example string because I wanted both Ireland and Russia in it, but for it to not be too long. Just because. 🙂 )

CountryName   MyTest                                                          _time
Ireland       And today, Ireland declared they no longer believed in trees.   2017-07-23 07:47:09
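
One thing to watch for: by default rex keeps only the first match, so an event that mentions two countries gives you just one of them. A possible variation, using the sample text from the question, is sketched below. The summary field name and the shortened country list are assumptions for the example; max_match=0 tells rex to keep every match, and the \b word boundaries are there so a name is not picked up from inside a longer word.

```spl
| makeresults
| eval summary="Russia is engaged in wide-ranging information warfare operations aimed at undermining the United States"
| rex field=summary max_match=0 "\b(?<CountryName>Russia|Romania|United States|Ireland)\b"
| mvexpand CountryName
| stats count by CountryName
```

With max_match=0 the CountryName field becomes multivalued, and mvexpand splits it into one row per country so each gets counted. This should give one row each for Russia and United States.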

Now that we have that, we need to use it in a lookup. The lookup is a geom lookup, it's special. We use our new CountryName as the featureIdField so that it knows how to look it up.

| makeresults 
| eval MyTest="And today, Ireland declared they no longer believed in trees." 
| rex field=MyTest "(?<CountryName>(Ireland|Iran|Iraq|Iceland|Portugal|Paraguay|Palestine|Romania|Russia))"
| geom geo_countries featureIdField=CountryName

With that for my example (it's a run-anywhere, so give it a try) you can switch to the Visualization tab and change to the Choropleth map and it should work.
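
Since the question also asked for a count per country, here is a rough sketch of how that might look against the real data. It assumes the feed text lives in a field called summary (as in the raw example in the question) and reuses the shortened country list, so swap in the full list and your own field name.

```spl
sourcetype=syndication
| rex field=summary "(?<CountryName>Ireland|Iran|Iraq|Iceland|Portugal|Paraguay|Palestine|Romania|Russia)"
| stats count by CountryName
| geom geo_countries featureIdField=CountryName
```

The stats count by CountryName gives the map a number to shade each country with, and events where no country was found drop out at that step.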

I'm editing my own answer to add that last piece, so I can no longer see if there's any significant adjustment needed for your example. If there is I'll add a comment to this. If not, let us know how this worked!

wifemin
Engager

Should I have a props.conf or do I just go Settings > Lookups > Automatic Lookup and how should I make it into a rex ?
Sorry, I don't really know how to splunk very well yet )':

0 Karma

Richfez
SplunkTrust

So, I gave an example of the inline rex you could use (well, follow the very first pieces of the answer to include all the countries).

To make that automatic, you would include it in a local/props.conf somewhere. Where exactly is still undetermined, because I don't know how you built what you have so far. Here are a couple of choices.

It is not likely from the sound of it, but if you are working inside an app you created, you'll use $SPLUNK_HOME/etc/apps/myappname/local/props.conf.

If you are not working inside an app you created, you are probably in search. In this case it'll be fine in $SPLUNK_HOME/etc/apps/search/local/props.conf.

As a generic place, which can also work, put it in $SPLUNK_HOME/etc/system/local/props.conf.

Just be sure not to put it in any default/props.conf. It doesn't go there and will get overwritten at next upgrade. In both of the latter cases, to someday future proof this I'd look at the section of the docs on creating an app and create a new app to put these configs in. Still, that's not critical - anywhere that works would be fine in this case.

Now, the config.

If you edit props.conf, you'll want a few lines like...

[my_sourcetype]
EXTRACT-country_names = <myextraction>

Like, in this case,

[syndication]
EXTRACT-country_names = (?<CountryName>(Ireland|Iran|Iraq|Iceland|Portugal|Paraguay|Palestine|Romania|Russia))

You'll have to restart bits and pieces of Splunk to get those to take effect, easiest is to just restart Splunk itself.

After that, if you search sourcetype=syndication in verbose mode you should see a field "CountryName" show up, which you can then use in the way mentioned above.
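
A quick way to check the extraction after the restart might be a search like the one below. It only assumes the CountryName field from the stanza above; with_country and total are names picked for this sketch.

```spl
sourcetype=syndication
| stats count(CountryName) as with_country, count as total
```

If with_country stays at 0 while total is not, the stanza is probably not being picked up (wrong app, a different sourcetype name, or no restart yet), or none of the events mention a country on your list.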

A couple of additional notes: niketnilay also has some great answers up in the comments above. You may be able to use those as well. Specifically, his last comment might work very well, so give that a try too. Goes about it a different way.

Regardless of which of these answers is the better for your needs, be sure to upvote the other ones that you found useful. (Upvoting is to click the little up arrow that shows up when you hover your mouse over the name of the useful comments/answers.)

0 Karma

richgalloway
SplunkTrust

Some sample data would be helpful.
You may be able to get country information using IP addresses.

---
If this reply helps you, Karma would be appreciated.
0 Karma

wifemin
Engager

I added an example, and there are no IP addresses available 😞

0 Karma

niketn
Legend

@wifemin, you need a CSV-based lookup of all countries with their geolocation.

https://developers.google.com/public-data/docs/canonical/countries_csv

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

wifemin
Engager

Can I use simplemaps-worldcities-basic.csv /geo_attr_countries.csv for lookup?

What should the search look like?
sourcetype="syndication" [|inputlookup .csv | fields country] ?

0 Karma

niketn
Legend

You can use geo_attr_countries.csv if you want to plot a Choropleth map using country regions (not latitude and longitude as posted in your question). I am adding an example search that uses a latitude/longitude CSV file; you can modify it if you prefer a Choropleth instead.

Assuming the country_geolocation.csv file has the fields name, country_abbr, latitude and longitude, and the lookup definition country_geolocation has been created:

sourcetype="syndication" [| inputlookup country_geolocation.csv | fields name | rename name as search | format ]
| eval mvRawData=split(_raw," ")
| lookup country_geolocation name as mvRawData output name country_abbr latitude longitude
| geostats latfield=latitude longfield=longitude count by name

PS: I have used _raw for extracting Country Name, however, you should be having some field name where text containing Country name is stored. Kindly use your <fieldname> instead of _raw.
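
One caveat with splitting on spaces: a multi-word name such as United States is broken into separate words, so it can never match the lookup. A possible variation is to pull the names out with a rex first and then look up the coordinates. This is a sketch under the same assumptions as above (a country_geolocation lookup with name, latitude and longitude fields); the three-country list is only a placeholder for your full list.

```spl
sourcetype="syndication"
| rex field=_raw max_match=0 "\b(?<name>Ireland|Russia|United States)\b"
| mvexpand name
| lookup country_geolocation name OUTPUT latitude longitude
| geostats latfield=latitude longfield=longitude count by name
```

The mvexpand gives each country its own row, so every row gets a single latitude/longitude pair before geostats runs.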

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma