How do you detect deviations in the spelling of a value?

yepyepyayyooo
New Member

Does anyone know of a way to detect deviations in the spelling of a value? For example, given domain="google.com", if a value such as "go0gle.com" or "g00ogl3.com" is returned, I want to output results, raise an alert, etc.

Another use case would be sender="user@domain.com", where a value such as "user@d0maine.com" or "user@domaiin3.com" is returned.

Bonus question: If anyone knows of a way to detect cAsE ObFusCaTiOn, that would be great too.

0 Karma
1 Solution

wrangler2x
Motivator

There is an app on Splunkbase called JellyFisher that supports Levenshtein distance, Damerau-Levenshtein distance, Jaro distance, Jaro-Winkler, match rating comparison, and Hamming distance comparisons, plus a number of phonetic algorithms, including Soundex. Here is a sample Levenshtein distance evaluation using this app:

... | jellyfisher levenshtein_distance(sourcetype,source)

This returns an integer, the Levenshtein distance: the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other.
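
To make that integer concrete, here is a minimal Python sketch of how a Levenshtein distance can be computed, run against the domain values from this thread. It is not JellyFisher's implementation; the function names `levenshtein` and `is_lookalike` and the threshold of 3 are assumptions made up for illustration, and a real threshold would need tuning against your own data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def is_lookalike(value: str, expected: str, max_distance: int = 3) -> bool:
    """Hypothetical helper: flag values that are close to, but not
    identical to, the expected value. max_distance is an assumed threshold."""
    return 0 < levenshtein(value, expected) <= max_distance


for candidate in ("go0gle.com", "g00ogl3.com"):
    print(candidate,
          levenshtein("google.com", candidate),
          is_lookalike(candidate, "google.com"))
# go0gle.com 1 True
# g00ogl3.com 3 True
```

With the same function, the three sender values from the question compare as 2 (user@domain.com vs user@d0maine.com), 2 (user@domain.com vs user@domaiin3.com), and 3 (user@d0maine.com vs user@domaiin3.com). Whether the app reports identical numbers should be confirmed in a live search.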

Each of the JellyFisher functions returns its result in a field named after the function (e.g., levenshtein_distance, damerau_levenshtein_distance, soundex).

Here is a link to the JellyFisher app.

I've mocked up an example of the app's Levenshtein distance function using your three sender examples. It won't run in Splunk unless the app is installed (installation is quick and doesn't require a restart).

| makeresults 
| eval sender1="user@domain.com", sender2="user@d0maine.com", sender3="user@domaiin3.com"
| jellyfisher levenshtein_distance(sender1, sender2)
| rename levenshtein_distance AS sender1sender2diff
| jellyfisher levenshtein_distance(sender1, sender3)
| rename levenshtein_distance AS sender1sender3diff
| jellyfisher levenshtein_distance(sender2, sender3)
| rename levenshtein_distance AS sender2sender3diff
| table sender1 sender2 sender3 sender1sender2diff sender1sender3diff sender2sender3diff
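
The query above does not address the bonus question about case obfuscation. One possible check, sketched here in Python rather than SPL, is to flag a value that differs from the expected one as written but matches it once both are case-folded. The name `is_case_obfuscated` is hypothetical; in a search, a similar comparison could presumably be built with eval's `lower()` function.

```python
def is_case_obfuscated(value: str, expected: str) -> bool:
    """True when value equals expected only after ignoring case.

    Assumes expected is stored in its canonical casing (e.g. all lowercase).
    """
    return value != expected and value.casefold() == expected.casefold()


print(is_case_obfuscated("GoOgLe.CoM", "google.com"))  # True
print(is_case_obfuscated("google.com", "google.com"))  # False
print(is_case_obfuscated("go0gle.com", "google.com"))  # False: a spelling change, not a case change
```

Case-folding the inputs before an edit-distance comparison would keep pure case differences from inflating the distance, so the two checks can be reported separately.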

0 Karma

yepyepyayyooo
New Member

Thanks. I'll certainly give it a try.

0 Karma

starcher
Influencer

Look into Levenshtein distance.

0 Karma