Splunk Search

how to use string from lookup table in search

afulamba
Explorer

Hi there,
I want to build a query with strings from the lookup table. I have the list of domains in the look up table that I want to use in the query to match as hostname
base query:
index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" | eval hostname=mvindex(split(....))

lookup
[| inputlookup domain.csv
| fields domain

Domain in the lookup are the hostname in the base search.
Can someone advise on this, please?

Tags (1)
0 Karma
1 Solution

dmarling
Builder

This should return results:

index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" 
| eval hostname=mvindex(split(....)) 
| search 
    [| inputlookup domain.csv 
    | eval hostname=domain
    | table hostname 
    | format]

You need to make the name of the field that contains the data you want match the name of the field it will be running that search against. The format command will then format the results of the lookup into SPL that can be executed on a search line.

If this comment/answer was helpful, please up vote it. Thank you.

View solution in original post

afulamba
Explorer

Here is the sample log event and the lookup

First query for hostname from the logs
index=web host=1 OR host=2 "*HTTP/1.1 3*" "*abc*"
| eval hostname=mvindex(split(mvindex(split(_raw,"&HOSTNAME="),1),"&AUTHNLEVEL="),0)
| eval hostname=trim(replace(hostname,"www.",""))

10.154.248.238 - - [03/May/2019:06:49:35.367 -0500] GET /xxx/rest/services/public/abc?ERROR_CODE=0x00000000&ERROR_TEXT=BA0521I%20%20%20Successful%20completion&HOSTNAME=www.yyy.us&AUTHNLEVEL= HTTP/1.1 307 - https://www.yyy.us/logout Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) userid=- sec=0 usec=13508 fwd=,- route=route0

Lookup table
| inputlookup domain.csv

domain
yyy.us
yyy.at
yyy.ae

So in this example if I see yyy.us in the both the result I am expecting this when I combine the query in one.

0 Karma

dmarling
Builder

Can you run this search and post the results for just those three domains you have listed and post the results?

| inputlookup domain.csv
| eval hostname=domain
| table hostname
| format

It should return results that look like this: ( ( hostname="yyy.us" ) OR ( hostname="yyy.at" ) OR ( hostname="yyy.ae" ) )

I got this to work with a run anywhere example using the data you provided. Here's the query that shows your main query and the results:

| makeresults count=1
| eval _raw="10.154.248.238 - - [03/May/2019:06:49:35.367 -0500] GET /xxx/rest/services/public/abc?ERROR_CODE=0x00000000&ERROR_TEXT=BA0521I%20%20%20Successful%20completion&HOSTNAME=www.yyy.us&AUTHNLEVEL= HTTP/1.1 307 - https://www.yyy.us/logout Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) userid=- sec=0 usec=13508 fwd=,- route=route0"
| rex field=_raw "\[(?<timestamp>[^\]]+)"
| eval _time=strptime(timestamp, "%d/%b/%Y:%H:%M:%S.%3N%z")
| fields - timestamp
| eval hostname=mvindex(split(mvindex(split(_raw,"&HOSTNAME="),1),"&AUTHNLEVEL="),0)
| eval hostname=trim(replace(hostname,"www.",""))

Your results should look like this:
_raw
10.154.248.238 - - [03/May/2019:06:49:35.367 -0500] GET /xxx/rest/services/public/abc?ERROR_CODE=0x00000000&ERROR_TEXT=BA0521I%20%20%20Successful%20completion&HOSTNAME=www.yyy.us&AUTHNLEVEL= HTTP/1.1 307 - https://www.yyy.us/logout Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) userid=- sec=0 usec=13508 fwd=,- route=route0
_time
2019-05-03 07:49:35.367
hostname
yyy.us

This run anywhere query mimics your lookup file and produces the results I would expect with the format command:

| makeresults count=1
| fields - _time
| eval domain="yyy.us~yyy.at~yyy.ae"
| makemv domain delim="~"
| mvexpand domain
| eval hostname=domain
| table hostname
| format

You should see this under the search header: ( ( hostname="yyy.us" ) OR ( hostname="yyy.at" ) OR ( hostname="yyy.ae" ) )

Combining them into a single search that will return any hostname that exists in that lookup:

| makeresults count=1
| eval _raw="10.154.248.238 - - [03/May/2019:06:49:35.367 -0500] GET /xxx/rest/services/public/abc?ERROR_CODE=0x00000000&ERROR_TEXT=BA0521I%20%20%20Successful%20completion&HOSTNAME=www.yyy.us&AUTHNLEVEL= HTTP/1.1 307 - https://www.yyy.us/logout Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) userid=- sec=0 usec=13508 fwd=,- route=route0"
| rex field=_raw "\[(?<timestamp>[^\]]+)"
| eval _time=strptime(timestamp, "%d/%b/%Y:%H:%M:%S.%3N%z")
| fields - timestamp
| eval hostname=mvindex(split(mvindex(split(_raw,"&HOSTNAME="),1),"&AUTHNLEVEL="),0)
| eval hostname=trim(replace(hostname,"www.",""))
| search [| makeresults count=1
| fields - _time
| eval domain="yyy.us~yyy.at~yyy.ae"
| makemv domain delim="~"
| mvexpand domain
| eval hostname=domain
| table hostname
| format]

You can change the data after HOSTNAME=www to confirm that the filter is functioning with any three of the domains in the lookup as well as see it return no results if you make it something not in that lookup.

If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

dmarling
Builder

This should return results:

index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" 
| eval hostname=mvindex(split(....)) 
| search 
    [| inputlookup domain.csv 
    | eval hostname=domain
    | table hostname 
    | format]

You need to make the name of the field that contains the data you want match the name of the field it will be running that search against. The format command will then format the results of the lookup into SPL that can be executed on a search line.

If this comment/answer was helpful, please up vote it. Thank you.

afulamba
Explorer

I had to change the field name in the lookup from sf-domain to whitelistdomain and it worked. It seems "-" in the field name was the culprit which was not evaluating the host name. I had not given the actual domain names and the actual lookup details for security reason. My bad, we could have identified the issue earlier otherwise.
Thanks a lot for your help!!!!!!!

0 Karma

dmarling
Builder

You're welcome! Just as an FYI you can fix the hyphen causing the issue with either using a rename command with double quotes around the field name | rename "sf-domain" as hostname or use a single quote around the field name when doing an eval | eval hostname='sf-domain'

If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

afulamba
Explorer

Yup, sure. Thanks for all your help.

0 Karma

afulamba
Explorer

If I run the these 2 subsearch separately, I see the hostname value return from first search also returns from the look up from the second search, however, when run the search together as above, it doesn't return the data. Did it work for anybody for such scenario?
Thanks for the input.

0 Karma

dmarling
Builder

If could provide an example of the events that you are looking at (with any private information removed) form your main query and some examples of the events in your lookup file I can give you a more details query using a run anywhere example.

If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

afulamba
Explorer

Any more suggesstions please?

0 Karma

skoelpin
SplunkTrust
SplunkTrust

Your lookup should look like this

index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" | eval hostname=mvindex(split(....))
| lookup domain.csv domain AS host OUTPUT <Column name in csv you want to output>

So if you had another column name that was titled URL, your lookup would look like this

 index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" | eval hostname=mvindex(split(....))
| lookup domain.csv domain AS host OUTPUT URL
0 Karma

afulamba
Explorer

I tried both as suggested below but doesn't work as expected.

This returns no result
| inputlookup domain.csv
| rename domain AS hostname
| join hostname [| search
index=web host=1 OR host=2 sourcetype="apache:access" "*HTTP/1.1 5*" "*abc*"
| eval hostname=mvindex(split(mvindex(split(_raw,"xxxx="),1),"&yyyyy"),0) | stats count by hostname ]

This returns result even the domain value is not in the lookup.
[| inputlookup domain.csv | stats values(domain) as query ]
index=web host=1 OR host=2 sourcetype="apache:access" "*HTTP/1.1 5*" "*abc*" | eval hostname=mvindex(split(mvindex(split(_raw,"xxx"),1),"yyyyy"),0)

0 Karma

skoelpin
SplunkTrust
SplunkTrust

Why are you renaming it hostname? It has to join on a common field from the lookup to the events in Splunk. host is the default field in Splunk. Keep it as host and it will work

0 Karma

dmarling
Builder

If I'm reading his original post correctly, the domain lookup file equates to the hostname eval he is extracting in his search, not the host that the access log originated from.

If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

afulamba
Explorer

May be the labels I used are confusing.
base query example:
index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" | eval hostname=mvindex(split(....))
Here the derived hostname value could be abc.com, def.com etc.
In the look up table for domain column I have listed abc.com, def.com etc.
So my need is that the search should return the result if the derived hostname matches with domain in the lookup. To follow your approach I would need to add another column in the lookup something relevant to it otherwise. Please suggest.

0 Karma

afulamba
Explorer

Thats correct. whatever derived as hostname from logs, match it with domain in the lookup table.

0 Karma

skoelpin
SplunkTrust
SplunkTrust

Try something like this

| inputlookup domain.csv
| rename domain AS host
| join host
    [| search index=* host=1 OR host=1 "HTTP/1.1 30*" "abc" | stats count by host | fields + host ]
0 Karma

skoelpin
SplunkTrust
SplunkTrust

So you want to iterate through a lookup table and see what matches from the lookup and whats in Splunk?

0 Karma

afulamba
Explorer

Thanks for the input. Actually I have single column look up table with column name 'domain', no other column for host.
To avoid the list of domains in the base query that looks ugly, I want to keep the fix list of domains in the table and I can add new domains as and when required in the table. However, the hostname derived from the logs in the base query can be any value based on the client domain. I want to filter the requests in the apache logs which have the hostname exist in lookup table.

0 Karma

skoelpin
SplunkTrust
SplunkTrust

I'm not sure I understand. Your lookup table only has a single column? Lookups are used to enrich your data in Splunk, here's an example of how it works

You have a lookup with 2 columns, first column is labeled host and second column is labeled domain. There is no domain fields in Splunk and you want to map the domain to each host. Since your hostnames in Splunk match the hostnames in the lookup, you can link those hostnames together to map the domain field. The logic looks like this

| lookup domain.csv <field that matches your lookup to date> OUTPUT <field that you have in lookup but not logs>

| lookup domain.csv host OUTPUT domain
0 Karma
Get Updates on the Splunk Community!

Monitoring Postgres with OpenTelemetry

Behind every business-critical application, you’ll find databases. These behind-the-scenes stores power ...

Mastering Synthetic Browser Testing: Pro Tips to Keep Your Web App Running Smoothly

To start, if you're new to synthetic monitoring, I recommend exploring this synthetic monitoring overview. In ...

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...