Solved: Fuzzy Logic match with multi word value

mjones414 · ‎11-01-2023

I'm using case() to compare two fields and flag a fuzzy match. Field 1 may contain "boa.com" while field 2 contains "Bank Of America". What I want is to compare the letters of field 1 against the first letter of each word in field 2, keeping in mind that there is no fixed maximum number of words the value may contain. I know I can usually use mvindex with an index of -1 to get the last value of a multivalue field, but I'm not sure how to combine that with case(), like(), and substr(). Has anyone accomplished anything like this before?

I'm trying things like | rex field=Company "(?<CamelCase>\b(\w))" but it's only returning "b" in CamelCase instead of "boa".

ITWhisperer · ‎11-01-2023

Similar to this response, try something like this

| rex max_match=0 field=field2 "(?<initial>[a-zA-Z])[a-zA-Z]* ?"
| eval webdomain=lower(mvjoin(initial,"")).".com"
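
To try the two lines above without real data, a minimal sketch might look like the following. It assumes a single test event built with makeresults and reuses the "Bank Of America" value from the question; the field name field2 and the final table command are only there for illustration.

```spl
| makeresults
| eval field2="Bank Of America"
| rex max_match=0 field=field2 "(?<initial>[a-zA-Z])[a-zA-Z]* ?"
| eval webdomain=lower(mvjoin(initial,"")).".com"
| table field2 initial webdomain
```

Because max_match=0 tells rex to keep every match rather than only the first, initial should come back as a multivalue field holding B, O and A, and webdomain should come out as boa.com.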

mjones414 · ‎11-01-2023

I was just about to come on here and post that I figured it out, but what I was doing isn't as elegant as what you did.

I did

| makemv CompanyName
| rex field=CompanyName "(?<CamelCase>\b(\w))"
| eval CamelCase=mvjoin(CamelCase,"")
| nomv CompanyName
| eval DomainMatchesCompany=case(like(lower(CompanyName),"%".substr(lower(domain_root),1,3)."%"),"Yes",
like(lower(CamelCase),"%".substr(lower(domain_root),1,3)."%"),"Yes", 1=1,"No")

I will try your approach and see if I get something similar.

mjones414 · ‎11-01-2023

As it turns out, with my data, word boundaries and \w work great, but since the string values actually contain whitespace, I have to convert the field to multivalue to get the desired outcome. If I do the pre-processing steps, both of our regular expressions seem to get the job done 🙂 Thanks so much for your reply!
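
For anyone landing here later, one possible variation is to skip the makemv/nomv steps and let rex return every word-initial character directly with max_match=0. This is only a sketch under assumptions: the sample values are made up with makeresults, domain_root is assumed to hold the domain without its TLD ("boa"), and the helper fields initial and initials are hypothetical names, not part of the original search.

```spl
| makeresults
| eval CompanyName="Bank Of America", domain_root="boa"
| rex max_match=0 field=CompanyName "(?<initial>\b\w)"
| eval initials=lower(mvjoin(initial,""))
| eval DomainMatchesCompany=case(like(lower(CompanyName),"%".substr(lower(domain_root),1,3)."%"),"Yes",
like(initials,"%".substr(lower(domain_root),1,3)."%"),"Yes", 1=1,"No")
| table CompanyName domain_root initials DomainMatchesCompany
```

The case() logic is the same as in the search posted earlier in the thread; only the way the initials are collected differs. Whether this behaves identically on real data has not been checked here, so treat it as a starting point.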
