Hi,
I'm a bit confused with the lookup command, I.e the syntax.
lookup <lookup-table-name> <lookup-field1> AS <local-field1>, <lookup-field2> AS <local-field2> OUTPUTNEW <lookup-destfield1> AS <local-destfield1>, <lookup-destfield2> AS <local-destfield2>
Here's my understanding of it, and hopefully someone can fill in the gaps or correct me if I've said anythign wrong.
Lookups add fields from an external source to your events based on the values of fields that are already present.
Now, from the syntax, I understand everything leading up to the OUTPUTNEW bit. I dont understand whats happening post OUTPUTNEW. What exactly are we doing, and why are we looking up again?
Here's a pretty simplistic use case. You have a lookup table called "full_user_names.csv", and it contains three columns: username
, first_name
, and last_name
. A sample row from this lookup table contains jsmith, jane, smith
- so the username jsmith
is mapped to a user whose full name is jane smith
.
You perform this search: index=web_proxy
and it returns events that contain username=jsmith
. You can use the lookup to find the user's full name:
index=web_proxy | lookup full_user_names.csv username OUTPUTNEW first_name, last_name
After the lookup, the event will contain two new fields: first_name=jane
and last_name=smith
.
Now let's imagine you have that same lookup table, but your search returns events that contain local_user=jsmith
(note the field name is now local_user
, which doesn't match the field name username
in your lookup. No problem, you use the AS
clause to fix it:
index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name, last_name
Again, after the lookup, the event will contain two new fields: first_name=jane
and last_name=smith
.
To make matters even more complicated, now you have the same lookup table, and your search returns events that contain local_user=jsmith
, and in order to correlate your events with some other logs, you want the user's first name to be returned into a field named f_name
and last name to be returned into a field named l_name
. Again, no problem - you solve it with the AS
clause again:
index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name AS f_name, last_name AS l_name
Now, after the lookup, the event will contain two new fields: f_name=jane
and l_name=smith
.
Lets take an example to understand the command better.
Assume there is lookup table which give STO (server type owner) and department information for a host. Say lookup table name is host_info.csv with fields host, STO, department.
host, STO,department
host1,abc,dept1
host2,xyz,dept2
Now, assuming you've an index which has data with src_host and dest_host and you want to add (enrich) STO and department information from host_info.csv lookup for each of src_host and dest_host.
If you run this
index=foo sourcetype=bar | table srch_host dest_host
| lookup host_info.csv host as src_host
OR
index=foo sourcetype=bar | table srch_host dest_host
| lookup host_info.csv host as src_host OUTPUT STO department
you'll get output with following fields
src_host,dest_host,STO,department
Since there are two hosts in your data, to avoid confusion, you would want to rename the output fields from lookup to differentiate to which host field those are associated with, so you'll do like this
index=foo sourcetype=bar | table srch_host dest_host
| lookup host_info.csv host as src_host OUTPUT STO AS src_STO department AS src_department
The output will now have fields:
src_host,dest_host,src_STO,src_department
Extending it to get information for dest host as well.
index=foo sourcetype=bar | table srch_host dest_host
| lookup host_info.csv host as src_host OUTPUT STO AS src_STO department AS src_department
| lookup host_info.csv host as dest_host OUTPUT STO AS dest_STO department AS dest_department
The output will now have fields:
src_host,dest_host,src_STO,src_department,dest_STO,dest_department
Hope this helps.
Here's a pretty simplistic use case. You have a lookup table called "full_user_names.csv", and it contains three columns: username
, first_name
, and last_name
. A sample row from this lookup table contains jsmith, jane, smith
- so the username jsmith
is mapped to a user whose full name is jane smith
.
You perform this search: index=web_proxy
and it returns events that contain username=jsmith
. You can use the lookup to find the user's full name:
index=web_proxy | lookup full_user_names.csv username OUTPUTNEW first_name, last_name
After the lookup, the event will contain two new fields: first_name=jane
and last_name=smith
.
Now let's imagine you have that same lookup table, but your search returns events that contain local_user=jsmith
(note the field name is now local_user
, which doesn't match the field name username
in your lookup. No problem, you use the AS
clause to fix it:
index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name, last_name
Again, after the lookup, the event will contain two new fields: first_name=jane
and last_name=smith
.
To make matters even more complicated, now you have the same lookup table, and your search returns events that contain local_user=jsmith
, and in order to correlate your events with some other logs, you want the user's first name to be returned into a field named f_name
and last name to be returned into a field named l_name
. Again, no problem - you solve it with the AS
clause again:
index=web_proxy | lookup full_user_names.csv username AS local_user OUTPUTNEW first_name AS f_name, last_name AS l_name
Now, after the lookup, the event will contain two new fields: f_name=jane
and l_name=smith
.
Does not work whatsoever. Like everything in Splunk.
This is by far the most easy-to-digest example of using and understanding the lookup command that I've found. Thank you for taking the time to lay this out. I owe you a beer or a coffee, sir.
Thank you! I'm glad it's useful!
you can use OUTPUT or OUTPUTNEW or neither. if neither are specified, all fields are outputted (aside from the lookup fields, since they are being matched upon)
If OUTPUT is specified, only those fields are outputted and will overwrite existing fields.
if OUTPUTNEW is specified, i believe it basically only brings in the lookup for events which don't already have the output fields.
You aren't looking anything up again, you're basically saying lookup this field in my table to match against my events and output a copy of that field into the events
also, if you ever have questions or comments on the documentation, you can (and are encouraged) to submit feedback at the bottom of the page. the docs team is very responsive and helpful.
hi, do you know any data limitation if we are using "OUTPUT" in the lookup?