Getting Data In

Inputlookup subsearch to match on field A and output field B in CSV file

Communicator

Im trying to correlate info based on a lookup file and no matter how I try, I cant make it work.

I have a CSV with values like:

host,country
host1,country1
host2
,country1
host3*,country2

etc

Im running a search that outputs hosts using a regex:
... | rex field=field1 //Simplified, extracts host from text
| [|inputlookup csvfile.csv ]
| bucket _time span=30d
| stats list(host), list(country) by _time

This search of course does not work.
I want to match the host field of the csv with the rexed host from the search and get the country the host is in in a list along with the actual host.

I've tried numerous ways but cannot get it to work. Some help would be greatly appreciated. How do I make sure to match the rexed host with a host* value in the csv?

0 Karma
1 Solution

Legend

Hi christoffertoft,
in your lookup don't use * but insert a row for each host

host,country
host1,country1
host2,country1
host3,country3
host4,country2
...

then try a search like this:

my_search
| rex "//Simplified"
| dedup host
| lookup csvfile.csv host OUTPUT country
| table host country

If you have more that 10,000 hosts use dedup 0 host.

Bye.
Giuseppe

View solution in original post

Legend

Hi christoffertoft,
in your lookup don't use * but insert a row for each host

host,country
host1,country1
host2,country1
host3,country3
host4,country2
...

then try a search like this:

my_search
| rex "//Simplified"
| dedup host
| lookup csvfile.csv host OUTPUT country
| table host country

If you have more that 10,000 hosts use dedup 0 host.

Bye.
Giuseppe

View solution in original post

Communicator

Hi Giuseppe,
I need to have host* entries because there are a too wide variety of possible names for me to catalogue. I only need to match the first part of the host anyway. Does this make a difference? Thanks for your reply

0 Karma

Communicator

Basically, I know that ABC123 and ABC456 belongs to a specific country and thus i only need to match on ABC.
If it was DEF123 i would now it was another country. As such, I can say ABC=country1, DEF=country2.

0 Karma

Legend

Hi christoffertoft,
so you can build your lookup

class_host,country
ABC,country1
DEF,country2
GHI,country3

and run this search

 my_search
 | rex "//Simplified"
 | eval class_host=substr(host,1,3)
 | lookup csvfile.csv class_host OUTPUT country
 | dedup host
 | table host country

Bye.
Giuseppe

0 Karma

Communicator

Right, I will try this! Thanks alot for the help

0 Karma

Legend

If you're satisfied, please accept or upvote this answer.
Bye.
Giuseppe

0 Karma

Legend

Hi christoffertoft,
lookup command matches only the full string, not *.
but if you can define a rule (e.g.: first 4 chars of hostname) you could build your lookup in this way (e.g. first 4 chars without *):

class_host,country
aaaa,country1
bbbb,country2
cccc,country3

and run something like this

my_search
| rex "//Simplified"
| eval class_host=substr(host,1,4)
| lookup csvfile.csv class_host OUTPUT country
| dedup host
| table host country

In this way lookup matches host and you can use the country field.

Bye.
Giuseppe

0 Karma

Let me see if I understand your problem. You have:
1. A csv file that maps host values to country values; and
2. Data containing values for host, which you are extracting with a rex command.

And your goal is to wind up with a table that maps host values present in #2 to their respective country values, as found from the csv file. If that's correct, it's pretty straightforward:

some search | rex extracting host value into field "host" | lookup csvfile.csv host OUTPUT country | stats values(host) AS host BY country

Communicator

Hi elliot,

youre correct, sorry for my bad wording in the OP. .

I tried this, and i get results but the country table is empty.

The hostnames in the csv have the first part ( a uniquely identifying part ) of the actual hostname presented by the search. i added a star to be able to match on the first part. This should be enough right?

0 Karma

Ahh, if you are looking to implement wildcard lookup, you'll need a bit more than that.
Check out this other answer for some guidance on how to do wildcard lookups:
https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html

0 Karma