Splunk Cloud Platform

Regex match in a Splunk lookup

nabeel652
Builder

Hello great Splunk community

I have a requirement to match FQDNs based on a regex.

I have a wildcard lookup that contains FQDNs and another field, say "match", which is true when the lookup matches.

For example in the lookup I have an entry

*.website.com

I want to match anything of the form abc.website.com or xyz.website.com.

However, I don't want to match anything of the form def.abc.website.com or uvw.xyz.website.com.

The wildcard doesn't care what comes before the rest of the pattern, dots included, as long as the string after the * matches. Any ideas how to do that?


tscroggins
Champion

Hi @nabeel652,

Here's an alternative using a single eval, which you can implement as a macro:

wildcard_domains.csv

domain
*.example.com

transforms.conf

[wildcard_domains]
batch_index_query = 0
case_sensitive_match = 0
filename = wildcard_domains.csv
match_type = WILDCARD(domain)
max_matches = 1
min_matches = 0

SPL

| makeresults format=csv data="
domain
foo.example.com
bar.example.com
bar.baz.example.com
"
| eval wildcard_domain_match=if(match(domain, replace(replace(spath(lookup("wildcard_domains", json_object("domain", domain), json_array("domain")), "domain"), "\\.", "\\."), "^\\*", "^[^.]+")), 1, 0)

Result

domain               wildcard_domain_match
foo.example.com      1
bar.example.com      1
bar.baz.example.com  0
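For readers who find the nested replace() calls hard to follow, the same transformation can be sketched outside Splunk. This is only an illustration in Python, not the eval itself: the helper name wildcard_to_regex is hypothetical, and unlike the eval above it also anchors the end of the pattern, which is an assumption on my part.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a lookup entry such as '*.example.com' into a regex
    in which the leading '*' stands for exactly one dot-free label.
    (Hypothetical helper, not part of Splunk.)"""
    if not pattern.startswith("*."):
        # No leading wildcard: require an exact, case-insensitive match.
        return re.compile(re.escape(pattern) + r"\Z", re.IGNORECASE)
    # Escape the literal remainder ('.example.com'), then prepend the
    # single-label expression that replaces the '*'.
    return re.compile(r"[^.]+" + re.escape(pattern[1:]) + r"\Z", re.IGNORECASE)

rx = wildcard_to_regex("*.example.com")
for domain in ("foo.example.com", "bar.example.com", "bar.baz.example.com"):
    # re.Pattern.match anchors at the start of the string.
    print(domain, 1 if rx.match(domain) else 0)
```

The key step is the same as in the SPL: escape the literal dots, then swap the leading * for [^.]+ so it cannot span more than one label.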

As @PickleRick noted, the order of the entries in wildcard_domains.csv is important. The first match "wins."

If I were doing this for myself, I would write an external lookup and use a standard or reference library for wildcard label matching rules relative to the use case: DNS, TLS, X.509, etc. Those examples follow the same general rules for leftmost label matching, but you may have other requirements.

(Edited to remove mvmap. I started writing a response with max_matches >= 1. I can provide an example if you need one.)
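The external lookup idea mentioned above could look roughly like the following. Treat it as a minimal sketch under stated assumptions: the field names domain and match, the hard-coded pattern list, and the helper names are all hypothetical, and the matching rule (a * is honored only as the entire leftmost label) is a simplified reading of the DNS/TLS-style rules rather than a reference implementation. A real script would read sys.stdin and write sys.stdout instead of the in-memory streams used here.

```python
import csv
import io

def label_match(pattern: str, fqdn: str) -> bool:
    """Compare label by label; '*' counts only as the whole leftmost label."""
    p = pattern.lower().rstrip(".").split(".")
    f = fqdn.lower().rstrip(".").split(".")
    # Different label counts (or an empty label) can never match.
    if len(p) != len(f) or "" in f:
        return False
    if p[0] == "*":
        return p[1:] == f[1:]
    return p == f

def run_lookup(instream, outstream, patterns):
    """Read CSV rows with a 'domain' column and fill in a 'match' column."""
    reader = csv.DictReader(instream)
    writer = csv.DictWriter(outstream, fieldnames=["domain", "match"])
    writer.writeheader()
    for row in reader:
        hit = any(label_match(p, row["domain"]) for p in patterns)
        writer.writerow({"domain": row["domain"],
                         "match": "true" if hit else "false"})

# Demo input shaped like the CSV an external lookup script receives.
demo_in = io.StringIO("domain,match\nabc.website.com,\ndef.abc.website.com,\n")
demo_out = io.StringIO()
run_lookup(demo_in, demo_out, ["*.website.com"])
print(demo_out.getvalue())
```

Because the comparison is done per label rather than per character, entry order in the pattern list no longer matters for this rule.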


bowesmana
SplunkTrust

As PickleRick says, you can't do regex in a lookup directly, but I've used his technique with a programmatically managed and ordered lookup, where the entries are ordered by their wildcards so that only the first hit counts.

If your rule is tight enough that each wildcard matches a whole domain segment rather than part of a segment, then you could post-filter by comparing the segment counts:

| makeresults
| eval fqdn1="abc.website.com"
| eval result="*.website.com" ``` lookup returns ```
| eval fqdn1_count=mvcount(split(fqdn1, ".")), result_count=mvcount(split(result, "."))
| eval match1=if(fqdn1_count=result_count, "HIT", "MISS")

| eval fqdn2="def.abc.website.com"
| eval result="*.website.com" ``` lookup returns ```
| eval fqdn2_count=mvcount(split(fqdn2, ".")), result_count=mvcount(split(result, "."))
| eval match2=if(fqdn2_count=result_count, "HIT", "MISS")

You'd have to do a bit more work to manage multiple results, e.g. if you have *.*.website.com in your lookup, a lookup for def.abc.website.com would get 2 hits.
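The multiple-result case can be sketched like this. It is a rough model, not Splunk behaviour: Python's fnmatch stands in for the WILDCARD match type (an assumption; fnmatch also treats ? and [] as special, which lookup wildcards do not), and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase

def label_count(name: str) -> int:
    """Number of dot-separated segments, as in mvcount(split(name, "."))."""
    return len(name.split("."))

def filter_hits(fqdn, patterns):
    """Keep only wildcard hits whose segment count equals the FQDN's."""
    # Step 1: every pattern the (approximated) wildcard lookup would return.
    hits = [p for p in patterns if fnmatchcase(fqdn.lower(), p.lower())]
    # Step 2: the post-filter on segment counts.
    return [p for p in hits if label_count(p) == label_count(fqdn)]

patterns = ["*.website.com", "*.*.website.com"]
print(filter_hits("abc.website.com", patterns))
print(filter_hits("def.abc.website.com", patterns))
```

With both entries in the lookup, def.abc.website.com gets two wildcard hits, and the count comparison keeps only the one with the same number of segments.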


PickleRick
SplunkTrust

Lookups don't use regexes for matching (unless you're talking about an "external lookup" - there you can implement everything you like).

But.

If we're talking about CSV-based lookups, the entries are inspected in the order they're written in the file. So if you set maximum matches to 1 and put a negative match early in the file, you can prevent Splunk from matching positively on an entry later in the file. It's kinda ACL-like behaviour.

But yes, that's ugly.
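To make the ordering trick concrete, here is a small model of it. Everything in it is an assumption for illustration: the rows are made-up data, fnmatch only approximates the WILDCARD match type, and returning the first hit stands in for max_matches = 1.

```python
from fnmatch import fnmatchcase

# Rows in the order they would appear in the CSV (hypothetical data).
ROWS = [
    ("*.*.website.com", "false"),  # negative entry listed first
    ("*.website.com", "true"),
]

def first_match(fqdn, rows=ROWS):
    """Return the 'match' value of the first row whose wildcard fits, else None."""
    for pattern, value in rows:
        if fnmatchcase(fqdn.lower(), pattern.lower()):
            return value  # first hit wins; later rows are never consulted
    return None

print(first_match("abc.website.com"))
print(first_match("def.abc.website.com"))
```

Swapping the two rows would make *.website.com win for every name under website.com, which is why the order of the file matters.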
