How to find similar values in a field?

templier
Communicator

Hello all!
I have an interesting question.
We have the following data in two fields:

a.tudhikova     b-antuzh
a.rusevskaya    a_rusevskaya
a.rusevskaya    alishka92

How can we detect that a.rusevskaya and a_rusevskaya are similar?
Question: can we write a search that matches similar values in these two fields?
I understand that the matching will produce some errors; that's not critical.

cmerriman
Super Champion

Try using the match eval function:
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/ConditionalFunctions#match.28SUBJE...

|eval similar=if(match(col2,col1),1,0)

Here is sample code using your data above:

| makeresults
| eval data="col1='a.tudhikova',col2='b-antuzh' col1='a.rusevskaya',col2='a_rusevskaya' col1='a.rusevskaya',col2='alishka92'"
| makemv data
| mvexpand data
| rename data as _raw
| kv
| rex mode=sed field=col1 "s/'//g"
| rex mode=sed field=col2 "s/'//g"
| eval similar=if(match(col2,col1),1,0)

templier
Communicator

@cmerriman hi.
In testing I ran into a problem.
We have a pair of addresses:
a.krikun - akrikunart

This pair is not detected as similar. Can we modify the regex?

cmerriman
Super Champion

You could add an OR statement in the if statement. Haven’t tested that myself yet, though.

|eval similar=if(match(col2,col1) OR match(col1,col2),1,0)
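
A note on why the pair above is missed: match() treats its second argument as a regular expression, so in a.rusevskaya the "." matches any single character, which is why it matched a_rusevskaya. For a.krikun the regex needs some character between "a" and "krikun", and akrikunart has none, so neither direction of the OR matches. One possible workaround, as a rough sketch only: strip everything that is not a letter or digit from both values, then check whether one normalized value contains the other. The field names n1 and n2 and the sample rows are made up for illustration.

```spl
| makeresults
| eval data="a.rusevskaya,a_rusevskaya;a.krikun,akrikunart;a.rusevskaya,alishka92"
| makemv delim=";" data
| mvexpand data
| eval col1=mvindex(split(data,","),0), col2=mvindex(split(data,","),1)
| eval n1=lower(replace(col1,"[^A-Za-z0-9]","")), n2=lower(replace(col2,"[^A-Za-z0-9]",""))
| eval similar=if(like(n2,"%".n1."%") OR like(n1,"%".n2."%"),1,0)
| table col1 col2 n1 n2 similar
```

Because n1 and n2 contain only letters and digits after the replace, they carry no like() wildcards of their own. This should flag the first two rows and not the third, but it still misses addresses whose parts are reordered.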

templier
Communicator

I tested it; it does not work.
We have a lot of email logs from users, and we want to see when a user sends mail to a personal email address. Very often they use a similar address, like the few examples in the first post, and one more:
v.anasimova - anasimova.v.s
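
For pairs like this one, where the initial moves from the front to the back, a containment check on the whole value will not help. A rough sketch of one alternative, under the assumption that the surname is the first run of at least four letters or digits in col1: extract that run and look for it inside col2. The field name tok and the minimum length of 4 are arbitrary choices for illustration, not anything from the Splunk docs, and short or common surnames will produce false positives.

```spl
| makeresults
| eval data="v.anasimova,anasimova.v.s;a.krikun,akrikunart;a.rusevskaya,alishka92"
| makemv delim=";" data
| mvexpand data
| eval col1=mvindex(split(data,","),0), col2=mvindex(split(data,","),1)
| rex field=col1 "(?<tok>[A-Za-z0-9]{4,})"
| eval similar=if(isnotnull(tok) AND like(lower(col2),"%".lower(tok)."%"),1,0)
| table col1 col2 tok similar
```

With this sample data, tok should come out as anasimova, krikun and rusevskaya, so the first two rows are flagged and the third is not.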

templier
Communicator

Hello,
Wow, it worked. Many thanks for the answer.
