I have created a lookup table in Splunk that contains a column with various regex patterns intended to match file paths. My goal is to use this lookup table within a search query to identify events where the path field matches any of the regex patterns specified in the Regex_Path column.
lookupfile:
Here is the challenge I'm facing:
When using the match() function in my search query, it only successfully matches if the Regex_Path pattern completely matches the path field in the event. However, I expected match() to perform partial matches based on the regex pattern, which does not seem to be the case.
Interestingly, if I manually replace the Regex_Path in the where match() clause with the actual regex pattern, it successfully performs the match as expected.
Here is an example of my search query:
index=teleport event="sftp" path!=""
| eval path_lower=lower(path)
| lookup Sensitive_File_Path.csv Regex_Path AS path_lower OUTPUT Regex_Path, Note
| where match(path_lower, Regex_Path)
| table path_lower, Regex_Path, Note
I would like to understand why the match() function isn't working as anticipated when using the lookup table and whether there is a better method to achieve the desired regex matching.
Any insights or suggestions on how to resolve this issue would be greatly appreciated.
Thanks. Does the lookup definition need global permissions?
Give it whatever permissions match the visibility you want it to have.
You cannot use regex matching in lookups. Lookups only support the * wildcard, and only when you create a lookup definition and use its advanced options to set the match type to WILDCARD(Regex_Path). Your search references the lookup file directly, not a lookup definition, so it can only match exactly.
So the lookup value must match exactly, or, with a wildcard definition, match a pattern containing *, e.g. /home/ubuntu/*.
You would then need another column holding the real regex. Note also that c:\boot.ini does not work as a regex for that path, because the \ needs to be escaped (c:\\boot\.ini).
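As a rough sketch of the wildcard approach described above: assume a lookup definition has been created on top of Sensitive_File_Path.csv with its match type set to WILDCARD(Regex_Path) in the advanced options. The definition name sensitive_file_path and the output alias matched_pattern are hypothetical, and the Regex_Path column is assumed to hold lowercase wildcard patterns such as /home/ubuntu/* rather than regexes. The original search could then look something like this:

```spl
index=teleport event="sftp" path!=""
| eval path_lower=lower(path)
| lookup sensitive_file_path Regex_Path AS path_lower OUTPUT Regex_Path AS matched_pattern, Note
| where isnotnull(matched_pattern)
| table path_lower, matched_pattern, Note
```

The where match() clause is no longer needed here, because the lookup itself does the wildcard comparison. Events that match no pattern get no matched_pattern value and are dropped.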
Thank you for your response. Since regex cannot be used in lookups, we are now defining everything within correlation searches, which is cumbersome to update. Are there any alternative solutions? Are there more efficient ways to detect suspicious command execution without relying solely on correlation searches? Your guidance on streamlining this process would be greatly appreciated.
Technically, you can work with regexes defined in lookups by doing something like this:
| eval enabled=1
| lookup regex_list.csv enabled OUTPUT regex
| eval match=mvmap(regex, if(match(path, regex), regex, null()))
where your csv contains 2 columns, the regex and a column called enabled with a value of 1.
This will pull ALL regexes into each event and then using mvmap will map the path against each of the regexes individually - for each match it will add the matching regex to the match field. After the mvmap, you will have a potentially multivalue field 'match' with one or more matches. If match is null, then there were no matches, so
| where isnotnull(match)
will filter out non matching paths.
This is not using a lookup as a lookup, but simply using the lookup as a repository of matches which you "load" to each event during the pipeline.
Depending on how many regexes you have, this may or may not be a practical option, since every regex is evaluated against every event.
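To try the approach without touching real data, something like the following two searches might help. This is only a sketch: the lookup name demo_regex_list.csv, the sample regexes, and the sample paths are all made up for illustration. The regexes use [.] instead of \. just to sidestep any question about backslash handling in eval string literals. The first search writes a small throwaway lookup:

```spl
| makeresults
| eval regex=split("^/etc/shadow$;/[.]ssh/;^/home/[^/]+/[.]bash_history$", ";")
| mvexpand regex
| eval enabled=1
| table regex, enabled
| outputlookup demo_regex_list.csv
```

The second search generates a few fake paths, loads every regex onto each event through the enabled=1 join key, and keeps only the paths that match at least one regex:

```spl
| makeresults
| eval path=split("/etc/shadow;/home/alice/.ssh/id_rsa;/tmp/report.txt", ";")
| mvexpand path
| eval enabled=1
| lookup demo_regex_list.csv enabled OUTPUT regex
| eval match=mvmap(regex, if(match(path, regex), regex, null()))
| where isnotnull(match)
| table path, match
```

Because match() is applied to each regex inside mvmap, unanchored patterns behave as partial matches, which is the behavior the original question was after. Note that outputlookup overwrites any existing file with the same name, so pick a name that is not already in use.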