I'm not sure whether or not this is a unique problem, but I'm hoping someone can help even if I'm overlooking an obvious solution :-).
I have a lookup table that is a domain whitelist that we allow through our proxies. For example, let's pretend a portion of this lookup table is like this (keeping in mind that some of the whitelisted domains might be sub-domains):
uri_host
--------
google.com
amazon.com
yahoo.com
answers.splunk.com
.
.
.
What I'm trying to figure out is if there is a way to not only use this lookup table to search across the proxy logs, but also add a field to each resulting event called, say, "match_string" that contains the value from the lookup table that caused the event to match.
For example, if in the proxy logs there are events of people browsing to "maps.google.com" and "images.google.com", those would match my whitelist due to "google.com" being there, but I want to somehow tie that back to the lookup table so that I know it shows up in the results because it matched against "google.com". The results of this might look like:
uri_host            match_string
--------            ------------
maps.google.com     google.com
images.google.com   google.com
mail.yahoo.com      yahoo.com
answers.splunk.com  answers.splunk.com
Hopefully that explains what I'm trying to do well enough, and thank you in advance to anyone who can help!
This is a good use case for a wildcard lookup. See this similar answer for more details:
https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html
Basically, set up your lookup table like this (say, domainlookup.csv):
uri_host,match_string
*google.com,google.com
*amazon.com,amazon.com
*yahoo.com,yahoo.com
*answers.splunk.com,answers.splunk.com
Then add this stanza to your transforms.conf:
[domainlookup]
filename = domainlookup.csv
match_type = WILDCARD(uri_host)
Now you can either add a lookup command to your search or set up an automatic lookup (which adds the match_string field automatically to each event of yoursourcetype):
props.conf
[yoursourcetype]
..other settings...
LOOKUP-domain = domainlookup uri_host OUTPUT match_string
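To see which rows a given host would hit before touching Splunk, the wildcard matching can be roughly emulated outside of it. This is only a sketch in Python, not Splunk's implementation: it assumes shell-style `*` matching via `fnmatchcase`, case-sensitive comparison, and "first matching row in file order wins". The helper names (`load_rows`, `match_string`) are made up for illustration.

```python
import csv
import io
from fnmatch import fnmatchcase

# Same rows as domainlookup.csv above (header has no spaces or quotes).
LOOKUP_CSV = """uri_host,match_string
*google.com,google.com
*amazon.com,amazon.com
*yahoo.com,yahoo.com
*answers.splunk.com,answers.splunk.com
"""


def load_rows(text):
    """Parse the lookup CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def match_string(host, rows):
    """Return match_string from the first row whose uri_host pattern matches.

    Assumption: first match in file order wins. What Splunk returns when
    several rows match depends on the lookup's settings.
    """
    for row in rows:
        if fnmatchcase(host, row["uri_host"]):
            return row["match_string"]
    return None


rows = load_rows(LOOKUP_CSV)
for host in ["maps.google.com", "mail.yahoo.com", "answers.splunk.com", "example.org"]:
    print(host, match_string(host, rows))
```

One thing the sketch makes easy to check: a pattern like `*google.com` also matches a host such as `notgoogle.com`, since `*` can match any characters. If that matters for your whitelist, you may want to test the patterns against a few hosts like this first.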
I think I have things set up as you have suggested, but I'm running into an issue where nothing is actually outputting for the match_string field. I have the config added to the transforms.conf file (but did not add anything into props.conf since I'm not doing an automatic lookup).
Then I'm running the following search, but match_string is blank:
index=proxy_logs | lookup domainlookup uri_host OUTPUT match_string | table uri_host, match_string
This should work.
Is it possible that you don't have a field named exactly "uri_host" in your events?
Also, if you post your related props.conf and transforms.conf stanzas, along with your lookup definition and a sample of the lookup file, we can help debug a bit more.
I'm able to make it work with the same settings on my test machine. Can you try running the following query in your instance and let me know if you see a value for the field match_string?
| gentimes start=-1 | eval uri_host="maps.google.com" | table uri_host | lookup domainlookup uri_host
WOW! Sometimes it's the smallest, dumbest things that trip you up... I was in the process of typing up the relevant part of the transforms.conf as well as a sample of the lookup csv, when I realized that the lookup table had quotes around the field names, so "match_string" instead of just match_string.
I fixed the lookup table and now everything works as expected. Sheesh. 🙂
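For anyone who hits the same thing, a quick check of the raw header line can catch this kind of problem before the file is uploaded. The following is a hypothetical helper in Python (not a Splunk tool); it deliberately splits the header on commas by hand rather than using a CSV parser, since a parser would silently strip the quotes. It assumes the field names themselves contain no commas.

```python
def suspicious_header_fields(header_line):
    """Return header fields that carry quotes or surrounding whitespace.

    Works on the raw first line of the CSV so that quote characters
    are still visible.
    """
    problems = []
    for raw in header_line.rstrip("\r\n").split(","):
        if raw != raw.strip() or '"' in raw or "'" in raw:
            problems.append(raw)
    return problems


print(suspicious_header_fields('uri_host,"match_string"'))
print(suspicious_header_fields("uri_host,match_string"))
```

An empty list means the header names look plain; anything returned is worth fixing by hand in the lookup file.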
Awesome. Thank you for posting back your results and what led to your problem. That information will help somebody else in the future.