How to find similar values in a field?

templier
Communicator

Hello all!
I have an interesting question.
We have the following data in two fields:

a.tudhikova     b-antuzh
a.rusevskaya    a_rusevskaya
a.rusevskaya    alishka92

How can we detect that a.rusevskaya and a_rusevskaya are similar?
Question: can we write a search that matches similar values across these two fields?
I understand that the detection will produce some errors; that's not critical.

cmerriman
Super Champion

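Try using the match eval function. Its second argument is treated as a regular expression, so the dot in a.rusevskaya matches any single character, including the underscore in a_rusevskaya.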
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/ConditionalFunctions#match.28SUBJE...

|eval similar=if(match(col2,col1),1,0)

Here is sample code using your data above:

| makeresults
| eval data="col1='a.tudhikova',col2='b-antuzh' col1='a.rusevskaya',col2='a_rusevskaya' col1='a.rusevskaya',col2='alishka92'"
| makemv data
| mvexpand data
| rename data as _raw
| kv
| rex mode=sed field=col1 "s/'//g"
| rex mode=sed field=col2 "s/'//g"
| eval similar=if(match(col2,col1),1,0)

templier
Communicator

@cmerriman hi.
In testing I ran into a problem.
We have this pair of addresses:
a.krikun - akrikunart

This pair is not detected as similar. Can we modify the regex?


cmerriman
Super Champion

You could add an OR condition inside the if(). I haven't tested that myself yet, though.

|eval similar=if(match(col2,col1) OR match(col1,col2),1,0)
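
For a pair like a.krikun / akrikunart, neither direction of match() can succeed, because the dot in a.krikun still has to consume one character that akrikunart does not have. One possible workaround, offered here as an untested sketch rather than as the thread's accepted answer, is to strip punctuation from both fields and then check whether one cleaned value contains the other. The field names n1 and n2 and the inline test data are made up for illustration.

```
| makeresults
| eval data="a.krikun,akrikunart;a.rusevskaya,a_rusevskaya;a.rusevskaya,alishka92"
| makemv delim=";" data
| mvexpand data
| eval col1=mvindex(split(data,","),0), col2=mvindex(split(data,","),1)
| eval n1=lower(replace(col1,"[^A-Za-z0-9]","")), n2=lower(replace(col2,"[^A-Za-z0-9]",""))
| eval similar=if(like(n2,"%".n1."%") OR like(n1,"%".n2."%"),1,0)
| table col1 col2 n1 n2 similar
```

With this normalization the first two pairs should come out with similar=1 and the alishka92 pair with similar=0. Pairs where the name parts are reordered would still be missed, since neither cleaned string contains the other.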

templier
Communicator

I tested it; it does not work.
We have a lot of email logs from our users, and we want to see when a user sends mail to a personal email address. Very often they use a similar address, like the few examples in the first post, and one more:
v.anasimova - anasimova.v.s
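
For reordered names such as v.anasimova / anasimova.v.s, one option is to split col1 on punctuation and flag the pair when any sufficiently long part of it appears inside col2. This is only a sketch and has not been run against real mail logs. It assumes that name parts of four or more characters are distinctive enough; the threshold of 4, the field names token and hit, and the inline test data are all made up for illustration.

```
| makeresults
| eval data="v.anasimova,anasimova.v.s;a.krikun,akrikunart;a.rusevskaya,alishka92"
| makemv delim=";" data
| mvexpand data
| eval col1=mvindex(split(data,","),0), col2=mvindex(split(data,","),1)
| eval token=split(lower(replace(col1,"[^A-Za-z0-9]+"," "))," ")
| mvexpand token
| eval hit=if(len(token)>=4 AND like(lower(col2),"%".token."%"),1,0)
| stats max(hit) as similar by col1 col2
```

Here anasimova and krikun should each be found in the second field, so those pairs get similar=1, while rusevskaya does not occur in alishka92. Short or common name parts will produce false positives, so the length threshold would need tuning on real data.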


templier
Communicator

Hello,
Wow, it worked. Many thanks for the answer.
