Splunk Search

Help using lookup tables to select search results

ilhwan
Path Finder

I have a lookup table named ics_special_domains that contains this:

domain_name,type
microsoft.com,microsoft
*.microsoft.com,microsoft
google.com,google
*.google.com,google
nwngms.com,ot
*.nwngms.com,ot
gasco.com,it
*.gasco.com,it

I'm trying to use this in a search to filter on specific types, but I'm trying a basic search first.  This is the most basic search I'm trying:

index=ics_dns ( query_type="A" OR query_type="AAAA" )
| lookup ics_special_domains domain_name as query{} outputnew type as domain_type
| where domain_type="microsoft"

It returns this error:

basic_string::erase: __pos (which is 18446744073709551615) > this->size() (which is 0)

I'd appreciate any help figuring this out.

Labels (1)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Can you explain the use case in which wildcard must be used?  If all you want to compare is whether a domain matches, you do not need *.microsoft.com.  All you need is a table like

domain_name,type
microsoft.com,microsoft
google.com,google
nwngms.com,ot
gasco.com,it

and trim values of query{} down to this level, like

index=ics_dns ( query_type="A" OR query_type="AAAA" )
| eval l2domain = mvmap(query{}, replace(query{}, "([^\.]+\.[^\.]+)$", "\1"))
| lookup ics_special_domains domain_name as l2domain outputnew type as domain_type
| where domain_type="microsoft"

 The above should work in Splunk 9.  In Splunk 8 and below, you can do

index=ics_dns ( query_type="A" OR query_type="AAAA" )
| mvexpand query{}
| rex field=query{} "(?<l2domain>[^\.]+\.[^\.]+)$"
| lookup ics_special_domains domain_name as l2domain outputnew type as domain_type
| where domain_type="microsoft"

  

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

I think the leading character cannot be wildcard if that field is set up as WILDCARD.

0 Karma

ilhwan
Path Finder

Is there a way I can enter wildcard domains to compare against?

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Can you explain the use case in which wildcard must be used?  If all you want to compare is whether a domain matches, you do not need *.microsoft.com.  All you need is a table like

domain_name,type
microsoft.com,microsoft
google.com,google
nwngms.com,ot
gasco.com,it

and trim values of query{} down to this level, like

index=ics_dns ( query_type="A" OR query_type="AAAA" )
| eval l2domain = mvmap(query{}, replace(query{}, "([^\.]+\.[^\.]+)$", "\1"))
| lookup ics_special_domains domain_name as l2domain outputnew type as domain_type
| where domain_type="microsoft"

 The above should work in Splunk 9.  In Splunk 8 and below, you can do

index=ics_dns ( query_type="A" OR query_type="AAAA" )
| mvexpand query{}
| rex field=query{} "(?<l2domain>[^\.]+\.[^\.]+)$"
| lookup ics_special_domains domain_name as l2domain outputnew type as domain_type
| where domain_type="microsoft"

  

johnhuang
Motivator

@yuanliu 

Just want to share a more robust way of extracting domains, esp if it contains a top level country domain or port.

| makeresults | eval input_string="site.taiwan.asus.com.tw"
| append [| makeresults | eval input_string="www.telegraph.co.uk"]
| append [| makeresults | eval input_string="company.splunkcloud.com:8089"]
| rex field=input_string "(?<sld_name_1>[^\.]+\.[^\.]+)$"
| rex field=input_string "(?<sld_name_2>([\w-]+(\.com?\.[a-zA-Z]{2,5}|\.[a-zA-Z]{2,5})))((:\d+)?$)"

 

0 Karma

ilhwan
Path Finder

Yes, that makes sense.  It did just occur to me to trim the query down (except that I was thinking about split() instead of rex).  However, your second query still gives me the same error:

basic_string::erase: __pos (which is 18446744073709551615) > this->size() (which is 0)

There must be a problem with my lookup table.  I did remove the wildcard lines, and it looks like this now:

domain_name,type
microsoft.com,microsoft
google.com,google
nwngms.com,ot
gasco.com,it
0 Karma

ilhwan
Path Finder

It works now.  I had to go back into the lookup definition and remove the WILDCARD match type.  I don't understand how to use that match type, but I have my solution now.  Thanks for your help.

0 Karma
Get Updates on the Splunk Community!

Data Management Digest – December 2025

Welcome to the December edition of Data Management Digest! As we continue our journey of data innovation, the ...

Index This | What is broken 80% of the time by February?

December 2025 Edition   Hayyy Splunk Education Enthusiasts and the Eternally Curious!    We’re back with this ...

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Hello Splunk Community,   We're thrilled to share an exciting update that will help you manage your data more ...