How to find similar values in a field?

templier
Communicator

Hello all!
I have an interesting question.
We have the following data in two fields:

a.tudhikova     b-antuzh
a.rusevskaya    a_rusevskaya
a.rusevskaya    alishka92

How can we detect that a.rusevskaya and a_rusevskaya are similar?
Question: can we write a search that checks whether the values of these two fields are similar?
I understand that the detection will sometimes be wrong; that is not critical.

cmerriman
Super Champion

Try using the match eval function:
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/ConditionalFunctions#match.28SUBJE...

|eval similar=if(match(col2,col1),1,0)

Here is a sample search using your data above:

|makeresults|eval data="col1='a.tudhikova',col2='b-antuzh' col1='a.rusevskaya',col2='a_rusevskaya' col1='a.rusevskaya',col2='alishka92'"|makemv data|mvexpand data|rename data as _raw|kv|rex mode=sed field=col1 "s/'//g"|rex mode=sed field=col2 "s/'//g"|eval similar=if(match(col2,col1),1,0)
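
A note on why this flags a_rusevskaya: match(SUBJECT, REGEX) treats col1 as an unanchored regular expression, so the "." in a.rusevskaya matches any single character, including "_". If you would rather not depend on col1 happening to be a usable regex, one alternative is to strip everything that is not a letter or digit from both fields and compare what is left. This is only a minimal sketch: the field names n1 and n2 are made up for illustration, and the single hard-coded pair stands in for your real events.

```spl
| makeresults
| eval col1="a.rusevskaya", col2="a_rusevskaya"
| eval n1=lower(replace(col1, "[^A-Za-z0-9]", "")), n2=lower(replace(col2, "[^A-Za-z0-9]", ""))
| eval similar=if(n1==n2, 1, 0)
```

Both values reduce to arusevskaya here, so similar is 1. Any pair that differs in more than punctuation or letter case stays at 0.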

templier
Communicator

@cmerriman hi.
I ran into a problem while testing. We have these two addresses:
a.krikun - akrikunart

This pair is not flagged as similar. Can we modify the regex?

cmerriman
Super Champion

You could add an OR condition inside the if(). I haven’t tested that myself yet, though.

|eval similar=if(match(col2,col1) OR match(col1,col2),1,0)
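
If the OR alone is not enough, the likely reason is that "." in a regex needs exactly one character in that position, and akrikunart has nothing between "a" and "krikun". A possible workaround, again only a sketch with the hypothetical helper fields n1 and n2, is to strip punctuation from both values first and then check whether either cleaned value contains the other:

```spl
| makeresults
| eval col1="a.krikun", col2="akrikunart"
| eval n1=lower(replace(col1, "[^A-Za-z0-9]", "")), n2=lower(replace(col2, "[^A-Za-z0-9]", ""))
| eval similar=if(like(n2, "%".n1."%") OR like(n1, "%".n2."%"), 1, 0)
```

Here n1 becomes akrikun and n2 stays akrikunart, so the first like() is true and similar is 1. Addresses where the parts are reordered, for example initials moved to the end, would still be missed, and very short values will produce false positives.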

templier
Communicator

I tested it, and it does not work.
We have a lot of email logs from users, and we want to see when a user sends mail to a personal email address. Very often the personal address is similar to the work one. There are a few examples in the first post, and here is one more:
v.anasimova - anasimova.v.s

templier
Communicator

Hello,
Wow, it worked. Many thanks for the answer.
