
How to find similar values in a field?

templier
Communicator

Hello all!
I have an interesting question.
We have the following data in two fields:

a.tudhikova b-antuzh
a.rusevskaya    a_rusevskaya
a.rusevskaya    alishka92

How can we detect that a.rusevskaya and a_rusevskaya are similar?
Question: can we write a search that matches these two fields by similarity?
I understand that the matching will produce some errors; that is not critical.


cmerriman
Super Champion

Try using the match eval function:
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/ConditionalFunctions#match.28SUBJE...

|eval similar=if(match(col2,col1),1,0)

Here is sample code using your data above:

| makeresults
| eval data="col1='a.tudhikova',col2='b-antuzh' col1='a.rusevskaya',col2='a_rusevskaya' col1='a.rusevskaya',col2='alishka92'"
| makemv data
| mvexpand data
| rename data as _raw
| kv
| rex mode=sed field=col1 "s/'//g"
| rex mode=sed field=col2 "s/'//g"
| eval similar=if(match(col2,col1),1,0)
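
A note on why this works: match treats its second argument as a regular expression, so the value of col1 is used as the pattern, and the "." in a.rusevskaya matches any single character, including the "_" in a_rusevskaya. The pattern is also unanchored, so it can match anywhere inside col2. A minimal sketch of a single pair (the literal values are only for illustration):

```spl
| makeresults
| eval col1="a.rusevskaya", col2="a_rusevskaya"
``` col1 is used as the regex; "." matches the "_" in col2 ```
| eval similar=if(match(col2, col1), 1, 0)
```

The same behavior means that other regex metacharacters in a username (for example "+") would also be interpreted as part of the pattern rather than as literal characters. The triple-backtick comment syntax assumes a Splunk version that supports inline search comments; drop that line on older versions.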

templier
Communicator

@cmerriman hi.
I ran into a problem while testing.
We have this pair of addresses:
a.krikun - akrikunart

This pair is not detected as similar. Can we modify the regex?


cmerriman
Super Champion

You could add an OR condition to the if statement. I haven't tested that myself yet, though.

|eval similar=if(match(col2,col1) OR match(col1,col2),1,0)
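
Another option, also untested and only a sketch: for a pair like a.krikun and akrikunart, the "." in the pattern requires some character between "a" and "krikun", and the other value has none, so neither direction of match succeeds. Stripping everything except letters and digits from both values before comparing avoids that. The field names n1 and n2 below are hypothetical helper fields, not something from your data:

```spl
| makeresults
| eval col1="a.krikun", col2="akrikunart"
``` n1 and n2 are hypothetical helper fields: lowercased, letters and digits only ```
| eval n1=lower(replace(col1, "[^A-Za-z0-9]", "")), n2=lower(replace(col2, "[^A-Za-z0-9]", ""))
``` flag the pair when either normalized value contains the other ```
| eval similar=if(like(n2, "%" . n1 . "%") OR like(n1, "%" . n2 . "%"), 1, 0)
```

Here n1 becomes akrikun and n2 stays akrikunart, so the first like condition is true. This is still only a substring check: addresses where the name parts are reordered would not be caught, and very short values will match too often. The triple-backtick comments assume a Splunk version that supports inline search comments.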

templier
Communicator

I tested it, and it does not work.
We have a lot of email logs from users, and we want to see when a user sends mail to a personal email address. Very often they use a similar address, like the few examples in the first post, and one more:
v.anasimova - anasimova.v.s


templier
Communicator

Hello,
Wow, it worked. Many thanks for the answer.
