Splunk Search

Understanding the LOOKUP command

mahbs
Path Finder

Hi,

I'm a bit confused with the lookup command, I.e the syntax.

lookup <lookup-table-name> <lookup-field1> AS <local-field1>, <lookup-field2> AS <local-field2> OUTPUTNEW <lookup-destfield1> AS <local-destfield1>, <lookup-destfield2> AS <local-destfield2>

Here's my understanding of it, and hopefully someone can fill in the gaps or correct me if I've said anythign wrong.

Lookups add fields from an external source to your events based on the values of fields that are already present.

Now, from the syntax, I understand everything leading up to the OUTPUTNEW bit. I dont understand whats happening post OUTPUTNEW. What exactly are we doing, and why are we looking up again?

1 Solution

elliotproebstel
Champion

Here's a pretty simplistic use case. You have a lookup table called "full_user_names.csv", and it contains three columns: username, first_name, and last_name. A sample row from this lookup table contains jsmith, jane, smith - so the username jsmith is mapped to a user whose full name is jane smith.

You perform this search: index=web_proxy and it returns events that contain username=jsmith. You can use the lookup to find the user's full name:

index=web_proxy | lookup full_user_names.csv username OUTPUTNEW first_name, last_name

After the lookup, the event will contain two new fields: first_name=jane and last_name=smith.

Now let's imagine you have that same lookup table, but your search returns events that contain local_user=jsmith (note the field name is now local_user, which doesn't match the field name username in your lookup. No problem, you use the AS clause to fix it:

index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name, last_name

Again, after the lookup, the event will contain two new fields: first_name=jane and last_name=smith.

To make matters even more complicated, now you have the same lookup table, and your search returns events that contain local_user=jsmith, and in order to correlate your events with some other logs, you want the user's first name to be returned into a field named f_name and last name to be returned into a field named l_name. Again, no problem - you solve it with the AS clause again:

index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name AS f_name, last_name AS l_name

Now, after the lookup, the event will contain two new fields: f_name=jane and l_name=smith.

View solution in original post

somesoni2
Revered Legend

Lets take an example to understand the command better.

Assume there is lookup table which give STO (server type owner) and department information for a host. Say lookup table name is host_info.csv with fields host, STO, department.

host, STO,department
host1,abc,dept1
host2,xyz,dept2

Now, assuming you've an index which has data with src_host and dest_host and you want to add (enrich) STO and department information from host_info.csv lookup for each of src_host and dest_host.

If you run this

index=foo sourcetype=bar | table srch_host dest_host 
| lookup host_info.csv host as src_host

OR

index=foo sourcetype=bar | table srch_host dest_host 
| lookup host_info.csv host as src_host OUTPUT STO department

you'll get output with following fields

src_host,dest_host,STO,department

Since there are two hosts in your data, to avoid confusion, you would want to rename the output fields from lookup to differentiate to which host field those are associated with, so you'll do like this

index=foo sourcetype=bar | table srch_host dest_host 
| lookup host_info.csv host as src_host OUTPUT STO AS src_STO department AS src_department

The output will now have fields:

src_host,dest_host,src_STO,src_department

Extending it to get information for dest host as well.

index=foo sourcetype=bar | table srch_host dest_host 
| lookup host_info.csv host as src_host OUTPUT STO AS src_STO department AS src_department
| lookup host_info.csv host as dest_host OUTPUT STO AS dest_STO department AS dest_department

The output will now have fields:

src_host,dest_host,src_STO,src_department,dest_STO,dest_department

Hope this helps.

elliotproebstel
Champion

Here's a pretty simplistic use case. You have a lookup table called "full_user_names.csv", and it contains three columns: username, first_name, and last_name. A sample row from this lookup table contains jsmith, jane, smith - so the username jsmith is mapped to a user whose full name is jane smith.

You perform this search: index=web_proxy and it returns events that contain username=jsmith. You can use the lookup to find the user's full name:

index=web_proxy | lookup full_user_names.csv username OUTPUTNEW first_name, last_name

After the lookup, the event will contain two new fields: first_name=jane and last_name=smith.

Now let's imagine you have that same lookup table, but your search returns events that contain local_user=jsmith (note the field name is now local_user, which doesn't match the field name username in your lookup. No problem, you use the AS clause to fix it:

index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name, last_name

Again, after the lookup, the event will contain two new fields: first_name=jane and last_name=smith.

To make matters even more complicated, now you have the same lookup table, and your search returns events that contain local_user=jsmith, and in order to correlate your events with some other logs, you want the user's first name to be returned into a field named f_name and last name to be returned into a field named l_name. Again, no problem - you solve it with the AS clause again:

index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name AS f_name, last_name AS l_name

Now, after the lookup, the event will contain two new fields: f_name=jane and l_name=smith.

nick405060
Motivator

Does not work whatsoever. Like everything in Splunk.

0 Karma

danbutterman
Explorer

This is by far the most easy-to-digest example of using and understanding the lookup command that I've found. Thank you for taking the time to lay this out. I owe you a beer or a coffee, sir.

0 Karma

elliotproebstel
Champion

Thank you! I'm glad it's useful!

0 Karma

cmerriman
Super Champion

you can use OUTPUT or OUTPUTNEW or neither. if neither are specified, all fields are outputted (aside from the lookup fields, since they are being matched upon)

If OUTPUT is specified, only those fields are outputted and will overwrite existing fields.

if OUTPUTNEW is specified, i believe it basically only brings in the lookup for events which don't already have the output fields.

You aren't looking anything up again, you're basically saying lookup this field in my table to match against my events and output a copy of that field into the events

also, if you ever have questions or comments on the documentation, you can (and are encouraged) to submit feedback at the bottom of the page. the docs team is very responsive and helpful.

amzar96
Explorer

hi, do you know any data limitation if we are using "OUTPUT" in the lookup? 

0 Karma
Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...