
How do you detect deviations in the spelling of a value?

yepyepyayyooo
New Member

Does anyone know of a way to detect deviations in the spelling of a value? For example, for the value domain="google.com", if a value of "go0gle.com" or "g00ogl3.com" is returned, then output results, raise an alert, etc.

Another use case would be sender="user@domain.com", where a value of "user@d0maine.com" or "user@domaiin3.com" is returned.

Bonus question: If anyone knows of a way to detect cAsE ObFusCaTiOn that would be great too.


wrangler2x
Motivator

There is an app on Splunkbase called JellyFisher that supports Levenshtein distance, Damerau-Levenshtein distance, Jaro distance, Jaro-Winkler, match rating comparison, and Hamming distance, plus a number of phonetic algorithms, including Soundex. Here is a sample Levenshtein distance evaluation using this app:

... | jellyfisher levenshtein_distance(sourcetype,source)

What is returned here is an integer: the Levenshtein distance, i.e., the minimum number of single-character insertions, deletions, and substitutions needed to turn one value into the other.
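
To make that integer concrete, here is a minimal Python sketch of the textbook dynamic-programming Levenshtein distance. It is not JellyFisher's code, and the function name `levenshtein_distance` is only chosen here to mirror the app's field name; it is meant to illustrate what the number means, nothing more.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b (illustrative sketch)."""
    # Keep the shorter string on the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,                      # delete ch_a
                current[j - 1] + 1,                   # insert ch_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Values taken from the question in this thread.
    print(levenshtein_distance("google.com", "go0gle.com"))
    print(levenshtein_distance("user@domain.com", "user@d0maine.com"))
```

A small nonzero distance relative to the length of the value is what suggests a lookalike; a distance of 0 means the values are identical.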

Each of the JellyFisher functions returns its result in a field named after the function (e.g., levenshtein_distance, damerau_levenshtein_distance, soundex).

Here is a link to the JellyFisher app.

I've mocked up an example of the app's Levenshtein distance function using your three sender examples. It won't run in Splunk unless the app is installed (installation is quick and does not require a restart).

| makeresults 
| eval sender1="user@domain.com", sender2="user@d0maine.com", sender3="user@domaiin3.com"
| jellyfisher levenshtein_distance(sender1, sender2)
| rename levenshtein_distance AS sender1sender2diff
| jellyfisher levenshtein_distance(sender1, sender3)
| rename levenshtein_distance AS sender1sender3diff
| jellyfisher levenshtein_distance(sender2, sender3)
| rename levenshtein_distance AS sender2sender3diff
| table sender1 sender2 sender3 sender1sender2diff sender1sender3diff sender2sender3diff
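
On the bonus question about case obfuscation: a plain character-by-character edit distance counts every flipped letter as a substitution, so it does not separate case tricks from misspellings. A rough Python sketch of one alternative, assuming the expected value is known in advance, is to compare the values after case folding and separately flag values that mix upper and lower case. The helper names `differs_only_by_case` and `has_mixed_case` are hypothetical and are not part of JellyFisher or Splunk.

```python
def differs_only_by_case(value: str, expected: str) -> bool:
    """True when value is not identical to expected but matches it
    once letter case is ignored."""
    return value != expected and value.casefold() == expected.casefold()


def has_mixed_case(value: str) -> bool:
    """True when value contains both uppercase and lowercase letters."""
    return any(c.isupper() for c in value) and any(c.islower() for c in value)


if __name__ == "__main__":
    expected = "user@domain.com"  # known-good value from the question
    for candidate in ("user@domain.com", "UsEr@DoMaIn.CoM", "user@d0maine.com"):
        print(candidate,
              differs_only_by_case(candidate, expected),
              has_mixed_case(candidate))
```

In a search, the same normalization idea would presumably map to comparing the lowercased field (for example with the eval function lower()) against the expected value before or alongside the distance check.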

yepyepyayyooo
New Member

Thanks. I'll certainly give it a try.

starcher
Influencer

Look into Levenshtein distance.
