I have a situation where I'm using case() to compare two fields to identify a fuzzy match. Field 1 may contain "boa.com" while field 2 contains "Bank Of America". What I want to do is compare the letters of field 1 against the first letter of each word in field 2 (bearing in mind that there is no fixed maximum number of words the value may contain). I know I can usually use mvindex with an index of -1 to get the last value of a multivalue field, but I'm not sure how to marry that with case(), like() and substr(). Has anyone ever accomplished anything like this before?
I'm trying things like | rex field=Company "(?<CamelCase>\b(\w))" but it's only returning "b" in CamelCase instead of "boa".
Similar to this response, try something like this:
| rex max_match=0 field=field2 "(?<initial>[a-zA-Z])[a-zA-Z]* ?"
| eval webdomain=lower(mvjoin(initial,"")).".com"
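To try the two lines above in isolation, here is a minimal sketch that feeds them a single made-up event. The makeresults event and the literal "Bank Of America" value are assumptions for illustration only; field2, initial and webdomain are the names used above. The important part is max_match=0, which tells rex to keep every match as a multivalue field instead of stopping after the first one.

```spl
| makeresults
| eval field2="Bank Of America"
``` max_match=0 keeps every match, so initial becomes a multivalue field (B, O, A) ```
| rex max_match=0 field=field2 "(?<initial>[a-zA-Z])[a-zA-Z]* ?"
``` join the initials, lowercase them and append the TLD ```
| eval webdomain=lower(mvjoin(initial,"")).".com"
| table field2 initial webdomain
```

For this sample value webdomain should come out as boa.com. The triple-backtick lines are inline SPL comments; drop them if your Splunk version does not support them.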
I was just about to come on here and post that I had figured it out, but what I was doing isn't as elegant as what you did.
I did:
| makemv CompanyName
| rex field=CompanyName "(?<CamelCase>\b(\w))"
| eval CamelCase=mvjoin(CamelCase,"")
| nomv CompanyName
| eval DomainMatchesCompany=case(like(lower(CompanyName),"%".substr(lower(domain_root),1,3)."%"),"Yes",
like(lower(CamelCase),"%".substr(lower(domain_root),1,3)."%"),"Yes", 1=1,"No")
I will try your approach and see if I get something similar.
So as it turns out, with my data, word boundaries and \w work great, but since the string values actually do contain whitespace, I have to convert the field to multivalue to get the desired outcome. If I do the pre-processing steps, both of our regular expressions seem to get the job done 🙂 Thanks so much for your reply!
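For anyone landing here later, a possible shortcut: assuming max_match=0 behaves as documented, the word-boundary regex may not need the makemv/nomv pre-processing at all, because rex then returns every match from the single string. This is only a sketch on one made-up event (the makeresults event and the sample values for CompanyName and domain_root are assumptions), so test it against your own data.

```spl
| makeresults
| eval CompanyName="Bank Of America", domain_root="boa"
``` \b(\w) with max_match=0 captures the first character of every word ```
| rex max_match=0 field=CompanyName "\b(?<CamelCase>\w)"
| eval CamelCase=lower(mvjoin(CamelCase,""))
``` same comparison as before: full name first, then the initials ```
| eval DomainMatchesCompany=case(
    like(lower(CompanyName),"%".substr(lower(domain_root),1,3)."%"),"Yes",
    like(CamelCase,"%".substr(lower(domain_root),1,3)."%"),"Yes",
    1=1,"No")
| table CompanyName domain_root CamelCase DomainMatchesCompany
```

With the sample values CamelCase should be boa and DomainMatchesCompany should be Yes. Note that \w also matches digits and underscores, so a name like "3M Company" would contribute "3" as an initial.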