Hello,
I am trying to extract and normalize phone numbers that appear in inconsistent formats. Below is my attempt to recreate a realistic example of what my data looks like. It contains multivalued fields, special characters, and numbers of varying lengths. I would prefer to do this at search time in props.conf / transforms.conf.
Ideally, I'd like to use something like a transforms stanza that says: start at a quotation mark, read all digits, and stop at the next quotation mark.
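To make the logic I have in mind concrete, here is a rough sketch in Python. It runs outside Splunk and is only an illustration of the behaviour I am after, not a working config; the function name extract_phone_numbers is made up for this example. It treats every double-quoted value as one phone number and strips everything that is not a digit.

```python
import re
from typing import List


def extract_phone_numbers(raw_event: str) -> List[str]:
    """Return the digits-only form of every double-quoted value in raw_event.

    Hypothetical helper for illustration only; it is not part of any Splunk API.
    """
    # Capture whatever sits between one quotation mark and the next.
    quoted_values = re.findall(r'"([^"]*)"', raw_event)
    # Drop every non-digit character, the same idea as replace(phone_number, "\D", "").
    cleaned = [re.sub(r"\D", "", value) for value in quoted_values]
    # Skip quoted values that contained no digits at all.
    return [digits for digits in cleaned if digits]


if __name__ == "__main__":
    # One event holding several quoted numbers (multivalued case).
    print(extract_phone_numbers('"234560005","(223)4560006","223-456-0007"'))
```

Each quoted value becomes one entry in the returned list, which is the multivalue behaviour I would like the extracted field to have.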
I had considered doing this with the following config, but it does not appear to handle multivalued fields. Could I please get some suggestions on how to correct my config, or on a more efficient way to go about this?
In props.conf:
EXTRACT-my_stanza
EVAL-clean_numbers = replace(phone_number, "\D", "")
In transforms.conf:
[my_stanza]
SOURCE_KEY =
REGEX = \"(?<phone_number>\d+[^\"])
MV_ADD = true
Examples:
Log 1:
"(223) 456-0001"
Log 2:
"223-456 0002","(223)456-0003 1234"
"223-456 0101","223-456-0102"
Log 3:
"223-456-0004"
Log 4:
"234560005","(223)4560006","223-456-0007"
Log 5:
"1223456-0008"
Desired results:
Log 1:
1234560001
Log 2:
1234560002
1234560003
Log 3:
1234560004
Log 4:
1234560005
1234560006
1234560007
Log 5:
1234560008