Fuzzy Logic match with multi word value

mjones414
Contributor

I'm using case() to compare two fields and identify a fuzzy match, but field 1 may contain "boa.com" while field 2 contains "Bank Of America". What I want to do is compare the letters of field 1 against the first letter of each word in field 2 (bearing in mind that there is no fixed maximum number of words the value may contain). I know I can usually use mvindex with an index of -1 to pick out the last value of a multivalue field, but I'm not sure how to combine that with case(), like(), and substr(). Has anyone accomplished anything like this before?

 

I'm trying things like | rex field=Company "(?<CamelCase>\b(\w))" but it's only returning "b" in CamelCase instead of "boa".



ITWhisperer
SplunkTrust

Similar to this response, try something like this:

| rex max_match=0 field=field2 "(?<initial>[a-zA-Z])[a-zA-Z]* ?"
| eval webdomain=lower(mvjoin(initial,"")).".com"
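To try the idea without real data, here is a minimal sketch built on makeresults. The field names field1, field2, acronym, and domain_root and the sample values are assumptions for illustration only. The important piece is max_match=0, which makes rex return every match as a multivalue field instead of only the first one.

```
| makeresults
| eval field1="boa.com", field2="Bank Of America"
| rex max_match=0 field=field2 "(?<initial>[a-zA-Z])[a-zA-Z]* ?"
| eval acronym=lower(mvjoin(initial,""))
| eval domain_root=lower(mvindex(split(field1,"."),0))
| eval DomainMatchesCompany=if(acronym==domain_root,"Yes","No")
```

With these sample values, initial should hold B, O, and A, so acronym and domain_root both come out as "boa". An exact comparison is used here instead of like() purely to keep the sketch short; a looser match may suit real data better.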

 

mjones414
Contributor

I was just about to come on here and post that I figured it out, but what I was doing isn't as elegant as what you did.

I did 

| makemv CompanyName
| rex field=CompanyName "(?<CamelCase>\b(\w))"
| eval CamelCase=mvjoin(CamelCase,"")
| nomv CompanyName
| eval DomainMatchesCompany=case(
    like(lower(CompanyName),"%".substr(lower(domain_root),1,3)."%"),"Yes",
    like(lower(CamelCase),"%".substr(lower(domain_root),1,3)."%"),"Yes",
    1=1,"No")


I will try your approach and see if I get something similar.


mjones414
Contributor

So as it turns out, with my data, word boundaries and \w work great, but since the string values actually contain whitespace, I have to convert them to multivalue to get the desired outcome. If I do the pre-processing steps, both of our regular expressions seem to get the job done 🙂 Thanks so much for your reply!
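A possible alternative to the makemv/nomv pre-processing, offered as a sketch that has not been run against the data in this thread: rex returns only the first match by default (max_match=1), which would explain getting "b" on its own. Setting max_match=0 on the word-boundary expression should collect one initial per word directly from the whitespace-separated string. The sample values below are assumptions.

```
| makeresults
| eval CompanyName="Bank Of America", domain_root="boa"
| rex max_match=0 field=CompanyName "\b(?<CamelCase>\w)"
| eval CamelCase=lower(mvjoin(CamelCase,""))
| eval DomainMatchesCompany=case(
    like(lower(CompanyName),"%".substr(lower(domain_root),1,3)."%"),"Yes",
    like(CamelCase,"%".substr(lower(domain_root),1,3)."%"),"Yes",
    true(),"No")
```

Here the first like() fails because "bank of america" does not contain "boa", while the second succeeds on the joined initials. Note that \w also matches digits and underscores, so names such as "3M Company" would contribute "3" as an initial.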
