Fuzzy Search for Splunk: Search not parsing hyphens and spaces

joshfraser123
New Member

I am running into an issue with the add-on "Fuzzy Search for Splunk". I am trying to use it to find malicious process names that are similar to a legitimate one, but the add-on doesn't seem to handle hyphens and spaces the way I need. The search below gives me a 100% match on "legit-unique-services.exe" and "legit unique services.exe", and there are a number of legitimate processes named like this.

 | fuzzy wordlist="services.exe" compare_field=process_name

Is there anything I can do to fix this? Or does the add-on have to be updated to handle this?

0 Karma

jlanders
Path Finder

Most likely the issue you are running into is with the "delims" option. From the add-on README:

Delims accepts a regex string, escaped splunk style, and defaults to (\\\\|/|\s+|;|-)

So if you were to pass in a different delimiter to the command like:

| fuzzy wordlist="list.exe" delims="(\\\\)" compare_field=process_name

You may get better results.

0 Karma

joshfraser123
New Member

Thanks mate. Using that delimiter will mean that we exclude process names with hyphens or spaces in them entirely. Is there a way to include them and still get them to match?

0 Karma

jlanders
Path Finder

I looked at the code again to make sure I wasn't speaking out of turn and it basically works like this:

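>>> import re
>>> # Default delims: splits on backslash, slash, whitespace, semicolon, or hyphen
>>> pattern = r'(\\|/|\s+|;|-)'
>>> testdata = 'this-is-a-test.txt'
>>> matches = re.split(pattern, testdata)
>>> matches
['this', '-', 'is', '-', 'a', '-', 'test.txt']

>>> # Backslash-only delims: the hyphenated name stays in one piece
>>> pattern = r'(\\)'
>>> testdata = 'this-is-a-test.txt'
>>> matches = re.split(pattern, testdata)
>>> matches
['this-is-a-test.txt']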

The delims value is just a splitter, not a filtering mechanism, despite the bad variable naming I used in the script. I'll have to rename that later on... At any rate, removing the hyphen from delims (and \s+ as well, for names that contain spaces) should solve your issue.
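
For anyone who wants to try a delimiter set before putting it in a search, here is a minimal Python sketch of the same splitting with the hyphen and whitespace alternatives taken out. The pattern name PATH_ONLY_DELIMS and the helper split_tokens are made up for illustration, and dropping the delimiter pieces from the result is an assumption about how the tokens get used, not something confirmed from the add-on's code. Following the escaping shown in the readme default, the SPL form would presumably be delims="(\\\\|/|;)".

```python
import re

# Hypothetical delimiter set: the readme default minus the hyphen and \s+ alternatives.
PATH_ONLY_DELIMS = r'(\\|/|;)'


def split_tokens(value, pattern=PATH_ONLY_DELIMS):
    """Split value on pattern and return only the non-empty text pieces."""
    parts = re.split(pattern, value)
    # With a single capturing group, re.split alternates text and delimiter,
    # so the even indexes hold the text between delimiters.
    return [p for p in parts[::2] if p]


# A full path is still broken into its components...
print(split_tokens(r'C:\Windows\System32\legit-unique-services.exe'))
# ...but names containing hyphens or spaces are kept whole.
print(split_tokens('legit unique services.exe'))
```

With this pattern, "legit-unique-services.exe" and "legit unique services.exe" each reach the comparison as a single word rather than as three.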

0 Karma

jlanders
Path Finder

The delims option tells the command how to split up the input values it receives. For example, if you have an input 'this-is-my-process.exe', the default value splits this into multiple words and compares your wordlist to each word: "this", "is", "my", and "process.exe".

By changing the delims value, you can change this behavior so that "this-is-my-process.exe" is evaluated as a whole word.
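
To make the effect on the score concrete, here is a rough Python sketch of split-then-compare. It is only an approximation: the scoring method the add-on uses isn't shown in this thread, so difflib.SequenceMatcher from the standard library stands in for it, and the function name best_score is made up for illustration.

```python
import re
from difflib import SequenceMatcher

# The readme default, written as a Python raw string.
DEFAULT_DELIMS = r'(\\|/|\s+|;|-)'


def best_score(word, value, delims=DEFAULT_DELIMS):
    """Return the highest similarity (0-100) between word and any token of value.

    SequenceMatcher is a stand-in scorer, not the add-on's actual algorithm.
    """
    # Even indexes of re.split hold the text between delimiters.
    tokens = [t for t in re.split(delims, value)[::2] if t]
    scores = (round(100 * SequenceMatcher(None, word, t).ratio()) for t in tokens)
    return max(scores, default=0)


# Default delims: "services.exe" is one of the tokens, so the match is exact.
print(best_score('services.exe', 'legit-unique-services.exe'))  # 100

# Backslash-only delims: the whole name is compared, so the score drops.
print(best_score('services.exe', 'legit-unique-services.exe', delims=r'(\\)'))
```

Under these assumptions, the 100% match in the original question comes from the exact token "services.exe" that the default delims produce, and comparing the whole name gives a lower, non-exact score.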

The command shouldn't be excluding anything based on the provided delims. I'll check the code later to validate 100%.

0 Karma