Splunk Search

Regex matching in a Splunk search query that involves a lookup table

frankeke
Loves-to-Learn

I have created a lookup table in Splunk that contains a column with various regex patterns intended to match file paths. My goal is to use this lookup table within a search query to identify events where the path field matches any of the regex patterns specified in the Regex_Path column.

Lookup file: [screenshot of the lookup table, frankeke_0-1733190952823.png]

 

Here is the challenge I'm facing:

  • When using the match() function in my search query, it only successfully matches if the Regex_Path pattern completely matches the path field in the event. However, I expected match() to perform partial matches based on the regex pattern, which does not seem to be the case.

  • Interestingly, if I manually replace the Regex_Path in the where match() clause with the actual regex pattern, it successfully performs the match as expected.

Here is an example of my search query:

index=teleport event="sftp" path!=""
| eval path_lower=lower(path)
| lookup Sensitive_File_Path.csv Regex_Path AS path_lower OUTPUT Regex_Path, Note
| where match(path_lower, Regex_Path)
| table path_lower, Regex_Path, Note

I would like to understand why the match() function isn't working as anticipated when using the lookup table and whether there is a better method to achieve the desired regex matching.

Any insights or suggestions on how to resolve this issue would be greatly appreciated.


frankeke
Loves-to-Learn

Thanks. Does the lookup definition need global permissions?


bowesmana
SplunkTrust

Give it whatever permissions match the visibility you want.


bowesmana
SplunkTrust

You cannot use regex matching in lookups. The only pattern matching a lookup supports is the * wildcard, and only when you create a lookup definition and set the match type to WILDCARD(Regex_Path) in its advanced options. You are using the lookup file directly, not a definition.

So the lookup value must either match exactly or, with a wildcard definition, match a * pattern such as /home/ubuntu/*.

You would then need a separate column holding the real regex. Note also that c:\boot.ini does not work as a regex as written: the \ needs to be escaped (c:\\boot\.ini), otherwise \b is read as a word boundary.
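As a rough sketch of the wildcard approach described above, the original search could be pointed at a lookup definition instead of the CSV file. Everything specific here is an assumption for illustration: sensitive_file_path is a hypothetical lookup definition over Sensitive_File_Path.csv with its match type set to WILDCARD(Wildcard_Path), Wildcard_Path is a hypothetical column of lowercase patterns such as /home/ubuntu/*, and Note is assumed to be filled in on every row so that it can double as the "did anything match" flag.

```spl
index=teleport event="sftp" path!=""
| eval path_lower=lower(path)
| lookup sensitive_file_path Wildcard_Path AS path_lower OUTPUT Note
| where isnotnull(Note)
| table path_lower, Note
```

This only gives * style matching, not regex matching, so it suits simple prefix or suffix patterns rather than the full patterns in Regex_Path.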

 


frankeke
Loves-to-Learn

Thank you for your response. Since regex cannot be used in lookups, we are currently defining everything inside the correlation searches themselves, which is cumbersome to update. Are there any alternative solutions? Are there more efficient ways to detect suspicious command execution without relying solely on correlation searches? Your guidance on streamlining this process would be greatly appreciated.


bowesmana
SplunkTrust

Technically, you can work with regexes defined in lookups by doing something like this:

| eval enabled=1
| lookup regex_list.csv enabled OUTPUT regex
| eval match=mvmap(regex, if(match(path, regex), regex, null()))

where your CSV contains two columns: the regex, and a column called enabled with a value of 1.

This pulls ALL of the regexes into each event. mvmap then tests the path against each regex individually and, for each one that matches, adds that regex to the match field. After the mvmap, you have a potentially multivalue field 'match' holding one or more matching regexes. If match is null, then there were no matches, so

| where isnotnull(match)

will filter out non matching paths.

This is not using a lookup as a lookup, but simply using the lookup as a repository of matches which you "load" to each event during the pipeline.

Depending on how many regexes you have, this may or may not be a practical option.
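Put together with the original search, the steps above might look roughly like the following. This is a sketch under assumptions, not a tested search: it assumes an enabled column (value 1 on every row) has been added to Sensitive_File_Path.csv next to Regex_Path, that the patterns are written in lowercase to suit path_lower, and it uses matched_regex as an illustrative field name in place of match.

```spl
index=teleport event="sftp" path!=""
| eval path_lower=lower(path), enabled=1
| lookup Sensitive_File_Path.csv enabled OUTPUT Regex_Path
| eval matched_regex=mvmap(Regex_Path, if(match(path_lower, Regex_Path), Regex_Path, null()))
| where isnotnull(matched_regex)
| table path_lower, matched_regex
```

Because every event carries the whole regex list before mvmap runs, the cost grows with both the event count and the number of patterns, so it is worth narrowing the base search as far as possible first.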

 

 
