Hello great Splunk community
I have a requirement to match FQDNs based on a regex.
I have a wildcard lookup that contains FQDNs and another field, "match", which is true in the case of a successful match.
For example, the lookup has the entry
*.website.com
and I want it to match anything like abc.website.com or xyz.website.com.
However, I don't want to match anything of the form def.abc.website.com or uvw.xyz.website.com.
The wildcard doesn't care what comes before the rest of the pattern, as long as the string after the * matches. Any ideas how to do that?
Hi @nabeel652,
Here's an alternative using a single eval, which you can implement as a macro:
wildcard_domains.csv
domain
*.example.com
transforms.conf
[wildcard_domains]
batch_index_query = 0
case_sensitive_match = 0
filename = wildcard_domains.csv
match_type = WILDCARD(domain)
max_matches = 1
min_matches = 0
SPL
| makeresults format=csv data="
domain
foo.example.com
bar.example.com
bar.baz.example.com
"
| eval wildcard_domain_match=if(match(domain, replace(replace(spath(lookup("wildcard_domains", json_object("domain", domain), json_array("domain")), "domain"), "\\.", "\\."), "^\\*", "^[^.]+")), 1, 0)
Result
domain wildcard_domain_match
foo.example.com 1
bar.example.com 1
bar.baz.example.com 0
As @PickleRick noted, the order of the entries in wildcard_domains.csv is important. The first match "wins."
If I were doing this for myself, I would write an external lookup and use a standard or reference library for wildcard label matching rules relative to the use case: DNS, TLS, X.509, etc. Those examples follow the same general rules for leftmost label matching, but you may have other requirements.
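To make the leftmost-label idea concrete, here is a minimal Python sketch of the matching logic such an external lookup might use. The function name wildcard_label_match is made up for illustration, it is not taken from any reference library, and it doesn't include the stdin/stdout CSV handling a real external lookup script needs. It assumes a * stands for exactly one whole, non-empty label, and unlike the stricter DNS/TLS rules it accepts a * in any label position.

```python
def wildcard_label_match(pattern: str, fqdn: str) -> bool:
    """Return True if fqdn matches pattern, where '*' stands for one whole label.

    Hypothetical helper for illustration only; not a DNS/TLS/X.509 implementation.
    """
    # Compare case-insensitively and ignore a trailing root dot.
    p_labels = pattern.lower().rstrip(".").split(".")
    f_labels = fqdn.lower().rstrip(".").split(".")

    # A wildcard never spans more than one label, so the counts must agree.
    if len(p_labels) != len(f_labels):
        return False

    for p, f in zip(p_labels, f_labels):
        if p == "*":
            if not f:  # a wildcard must consume a non-empty label
                return False
        elif p != f:
            return False
    return True
```

With this rule, *.website.com accepts abc.website.com but rejects both def.abc.website.com and the bare website.com.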
(Edited to remove mvmap. I started writing a response with max_matches >= 1. I can provide an example if you need one.)
As PickleRick says, you can't use a regex in a lookup directly, but I've used his technique with a programmatically managed and ordered lookup, where the entries are ordered by their wildcards so that only the first hit counts.
If your rule is tight enough that each wildcard matches a whole domain segment rather than part of a segment, then you could do post-matching on the segment counts:
| makeresults
| eval fqdn1="abc.website.com"
| eval result="*.website.com" ``` lookup returns ```
| eval fqdn1_count=mvcount(split(fqdn1, ".")), result_count=mvcount(split(result, "."))
| eval match1=if(fqdn1_count=result_count, "HIT", "MISS")
| eval fqdn2="def.abc.website.com"
| eval result="*.website.com" ``` lookup returns ```
| eval fqdn2_count=mvcount(split(fqdn2, ".")), result_count=mvcount(split(result, "."))
| eval match2=if(fqdn2_count=result_count, "HIT", "MISS")
You'd have to do a bit more work to manage multiple results, e.g. if you have *.*.website.com in your lookup, a lookup for def.abc.website.com would get 2 hits.
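For the multiple-results case, the same segment-count rule can be applied to every pattern the lookup returns, keeping only those with as many segments as the FQDN. A rough Python sketch of that filter follows (for instance, inside an external lookup script). The names label_count and filter_hits are invented for illustration, and the input list stands in for whatever the wildcard lookup returned.

```python
def label_count(name: str) -> int:
    """Number of dot-separated segments in a domain or wildcard pattern."""
    return len(name.split("."))


def filter_hits(fqdn: str, returned_patterns: list[str]) -> list[str]:
    """Keep only returned patterns whose segment count equals the FQDN's.

    Assumes each '*' in a pattern is meant to cover exactly one segment.
    """
    wanted = label_count(fqdn)
    return [p for p in returned_patterns if label_count(p) == wanted]


# Both patterns wildcard-match def.abc.website.com; only one has 4 segments.
hits = filter_hits("def.abc.website.com", ["*.website.com", "*.*.website.com"])
print(hits)
```

In SPL, the equivalent would be filtering the multivalue lookup result by comparing each value's segment count with the FQDN's.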
Lookups don't use regexes for matching (unless you're talking about an "external lookup" - there you can implement everything you like).
But.
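If we're talking about CSV-based lookups, the entries are inspected in the order they're written in the file. So if you set maximum matches to 1 and put a negative match early in the file, you can prevent Splunk from matching positively later in the file. For example, an entry *.*.website.com with match set to false, placed above *.website.com with match set to true, would catch def.abc.website.com first. It's kinda ACL-like behaviour.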
But yes, that's ugly.
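To show how the ordered, first-match-wins idea plays out, here is a small Python simulation. It is not Splunk code: fnmatch only approximates how a WILDCARD match_type lookup with max_matches = 1 behaves, and the table contents and the first_match name are assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Ordered like rows in the CSV: the more specific "deny" entry comes first.
# These rows are illustrative, not taken from a real lookup file.
LOOKUP_ROWS = [
    ("*.*.website.com", "false"),  # negative entry, shadows deeper subdomains
    ("*.website.com", "true"),     # positive entry for single-label subdomains
]


def first_match(fqdn: str, rows=LOOKUP_ROWS):
    """Return the 'match' value of the first row whose wildcard fits, else None.

    As in the lookup, '*' here matches any run of characters, dots included.
    """
    name = fqdn.lower()
    for pattern, match_value in rows:
        if fnmatchcase(name, pattern.lower()):
            return match_value  # first hit wins, like max_matches = 1
    return None


for host in ("abc.website.com", "def.abc.website.com", "other.org"):
    print(host, first_match(host))
```

Swapping the two rows would make def.abc.website.com return true, which is why the file order has to be managed carefully.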