How do you detect deviations in the spelling of a value?

yepyepyayyooo
New Member

Does anyone know of a way to detect deviations in the spelling of a value? For example, given domain="google.com", if a value such as "go0gle.com" or "g00ogl3.com" is returned, I would like to output results, raise an alert, etc.

Another use case would be sender="user@domain.com", where a value of "user@d0maine.com" or "user@domaiin3.com" is returned.

Bonus question: If anyone knows of a way to detect cAsE ObFusCaTiOn, that would be great too.

1 Solution

wrangler2x
Motivator

There is an app on Splunkbase that supports Levenshtein distance, Damerau-Levenshtein distance, Jaro distance, Jaro-Winkler, match rating comparison, and Hamming distance comparisons, plus a number of phonetic algorithms, including Soundex. It is called JellyFisher. Here is a sample Levenshtein distance evaluation using this app:

... | jellyfisher levenshtein_distance(sourcetype,source)

What is returned here is an integer: the Levenshtein distance, i.e., the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other.
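
To make that integer concrete outside of Splunk, here is a minimal pure-Python sketch of the classic dynamic-programming Levenshtein distance. It is not the app's code, and the helper is_lookalike and its max_distance threshold are hypothetical names introduced only for illustration; the right threshold depends on how long your values are and how many false positives you can tolerate.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # Only the previous row of the DP table is kept, to save memory.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def is_lookalike(candidate: str, trusted: str, max_distance: int = 3) -> bool:
    """Hypothetical helper: flag values that are close to, but not
    identical to, the trusted value (0 < distance <= max_distance)."""
    distance = levenshtein_distance(candidate, trusted)
    return 0 < distance <= max_distance


if __name__ == "__main__":
    trusted = "google.com"
    for value in ("google.com", "go0gle.com", "g00ogl3.com", "example.org"):
        print(value, levenshtein_distance(trusted, value), is_lookalike(value, trusted))
```

With the domain values from the question, "go0gle.com" is 1 edit away from "google.com" and "g00ogl3.com" is 3 edits away, so a threshold of 2 would miss the second one. An exact match has distance 0 and is deliberately not flagged.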

Each of the JellyFisher functions returns its result in a field named after the function (e.g., levenshtein_distance, damerau_levenshtein_distance, soundex).

Here is a link to the JellyFisher app.

I've mocked up an example of the app's Levenshtein distance function using your three sender examples. This won't run in Splunk unless the app is installed (it is quick to install and does not require a restart).

| makeresults 
| eval sender1="user@domain.com", sender2="user@d0maine.com", sender3="user@domaiin3.com"
| jellyfisher levenshtein_distance(sender1, sender2)
| rename levenshtein_distance AS sender1sender2diff
| jellyfisher levenshtein_distance(sender1, sender3)
| rename levenshtein_distance AS sender1sender3diff
| jellyfisher levenshtein_distance(sender2, sender3)
| rename levenshtein_distance AS sender2sender3diff
| table sender1 sender2 sender3 sender1sender2diff sender1sender3diff sender2sender3diff
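
On the bonus question about case obfuscation, which the search above does not address: one possible approach is to compare each value with the trusted value after folding case. The sketch below is plain Python rather than SPL, and the function names is_case_obfuscated and normalize are hypothetical, chosen only for this illustration. It assumes the field is one where case carries no meaning, such as a domain name.

```python
def is_case_obfuscated(candidate: str, trusted: str) -> bool:
    """True when candidate differs from trusted as written,
    but matches it once case is ignored."""
    return candidate != trusted and candidate.casefold() == trusted.casefold()


def normalize(value: str) -> str:
    """Strip surrounding whitespace and fold case, so that case changes
    do not inflate a later edit-distance comparison."""
    return value.strip().casefold()


if __name__ == "__main__":
    trusted = "google.com"
    for value in ("google.com", "GoOgLe.CoM", "go0gle.com"):
        print(value, is_case_obfuscated(value, trusted))
```

Folding case before computing an edit distance keeps the two checks separate: a value like "GoOgLe.CoM" is reported as a case variant, while "go0gle.com" is left to the distance comparison.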

yepyepyayyooo
New Member

Thanks. I'll certainly give it a try.

starcher
Influencer

Look into Levenshtein distance.
