Hi all, I have some raw data that looks like this (just a part):
....."","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM","natalie.someone@email.com","andrew.someone@email.com","UCCAPI/3823.323.10827.* OC/16.0.10827.20150 (Skype for Business)","UCCAPI/16.0.10730.2342342 OC/16.0.10730.20088 (Skype for Business)","","","","****-5042-5F76-A879-***7","","","","","200","[IM]","{""RequestType"":""BYE"",""RequestTime"":""2018-10-30T07:41:52.2147589"",""ContentType"":"""",""ResponseCode"":""200"",..
I want to extract the two email addresses from each raw event (natalie.someone@email.com, andrew.someone@email.com in this case) to be my two fields caller_email and receiver_email.
Does anyone know how to do this? Thanks a lot!
Try this regex: \"(?<caller>[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5})\",\"(?<receiver>[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5})\"
https://regex101.com/r/wsaYMy/1/
But it might be worth investing some time in defining a proper delims-based extraction for the entire event.
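For anyone who wants to try both ideas outside the search tool, here is a rough Python sketch. It is only an illustration under a few assumptions: the event is a shortened, made-up version of the sample from the question, the column positions 3 and 4 only hold for that shortened sample (the real event has more leading fields that were cut off), and Python spells named groups as (?P<name>...) instead of (?<name>...).

```python
import csv
import re

# Shortened, hypothetical version of the sample event from the question.
event = (
    '"","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM",'
    '"natalie.someone@email.com","andrew.someone@email.com",'
    '"UCCAPI/16.0.10730.2342342 OC/16.0.10730.20088 (Skype for Business)",'
    '"{""RequestType"":""BYE""}"'
)

# 1) Regex approach: two consecutive quoted email addresses.
EMAIL = r'[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5}'
PAIR = re.compile(r'"(?P<caller>' + EMAIL + r')","(?P<receiver>' + EMAIL + r')"')

match = PAIR.search(event)
fields = match.groupdict() if match else {}
print(fields)

# 2) Delimiter approach: treat the whole event as one CSV row.
# The csv module also unescapes the doubled quotes ("") inside the last field.
row = next(csv.reader([event]))

# Assumption: in this shortened sample the addresses sit in columns 3 and 4.
caller_email, receiver_email = row[3], row[4]
print(caller_email, receiver_email)
```

The delimiter version does not care what the addresses look like, only where they sit in the row, so it keeps working if an address uses characters the regex does not allow.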
I have a very similar question, although I could have one, two or three email addresses in the raw data. If I use the caller/receiver regex from this thread I can get results if there are exactly two email addresses. If I modify it with another duplicate of the pattern I can get results if there are exactly three email addresses. Is there a way to get results no matter how many email addresses appear in the raw data?
Hi willymac650,
try this regular expression:
(("[a-zA-Z0-9_\-.]+@[a-zA-Z0-9_\-.]+")+),
The double parentheses tell it to repeat the pattern matching. I am no expert; I just googled "find repeat patterns in regex" and one of the pages explained this. I have tried it on regex101.com with your data and I am able to match multiple times.
The only problem now is that I don't know how to name each match using $.
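On the naming problem, one possible workaround (sketched here in Python, not in the tool used in this thread) is to collect every match first and number the results afterwards. The function name extract_emails, the email1, email2, ... key scheme and the sample event are all made up for this illustration.

```python
import re

# One quoted email address; the capture group leaves the quotes out.
EMAIL_IN_QUOTES = re.compile(
    r'"([a-zA-Z0-9_.\-]+@[a-zA-Z0-9_.\-]+\.[a-zA-Z]{2,5})"'
)


def extract_emails(event):
    """Return {'email1': ..., 'email2': ...} for every quoted address found."""
    found = EMAIL_IN_QUOTES.findall(event)
    return {f"email{i}": addr for i, addr in enumerate(found, start=1)}


# Hypothetical event with three addresses.
sample = '"x","a.one@email.com","b.two@email.com","c.three@email.com","200"'
print(extract_emails(sample))
```

Because the matches are numbered after the fact, the same code handles one, two or three addresses without duplicating the pattern.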
Hope this helps
Blaise
The caller/receiver regex worked perfectly, thanks a lot! The suggestion about a delims-based extraction is also noted.
I have tried it on regex101.com and I think this will help you:
\s+[.]{5}"",".+?",".+?",(?<email1>".+?"),(?<email2>".+?"),
it extracts both emails and creates two fields called "email1" and "email2" to contain the result of the match.
\s+ one or more whitespace characters
[.]{5} 5 dots
"", 2 double quote characters, followed by a comma
".+?" 2 double quotes with anything inside; the ? makes the match as small as possible (lazy, i.e. non-greedy)
, a comma
".+?", same as above again
(?<email1>".+?") same as above, but this time it has parentheses around it, which says that the match needs to be saved; by default it would be saved into $1, but the ?<email1> part names the variable into which the matching part will be saved
, a comma
(?<email2>".+?") same as above, but this time the variable is called email2
, a comma
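Here is a small Python sketch of the breakdown above, in case someone wants to step through it. Assumptions: the event is a shortened, made-up fragment that starts with a space and five dots like the one posted in the question, the two groups are named email1 and email2 as described, and Python writes named groups as (?P<name>...).

```python
import re

# Hypothetical fragment: leading space, five dots, then the quoted fields.
event = (
    ' ....."","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM",'
    '"natalie.someone@email.com","andrew.someone@email.com",'
    '"UCCAPI/16.0.10730.2342342 OC/16.0.10730.20088 (Skype for Business)"'
)

# Positional pattern: skip two quoted fields, then capture the next two.
POSITIONAL = re.compile(
    r'\s+[.]{5}"",".+?",".+?",(?P<email1>".+?"),(?P<email2>".+?"),'
)
m = POSITIONAL.search(event)
with_quotes = m.groupdict() if m else {}
print(with_quotes)  # the captured values still include the double quotes

# Variant with the quotes moved outside the groups, so they are not captured.
POSITIONAL_BARE = re.compile(
    r'\s+[.]{5}"",".+?",".+?","(?P<email1>.+?)","(?P<email2>.+?)",'
)
m2 = POSITIONAL_BARE.search(event)
without_quotes = m2.groupdict() if m2 else {}
print(without_quotes)
```

Note that with the quotes inside the parentheses, the saved values keep their surrounding double quotes; the second pattern is one way to drop them.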
Hope this helps
Blaise
He is only showing a fragment of his log, so \s+[.]{5} is not what actually appears at the start of his data. That's why for my answer I just created a regex that looks for 2 consecutive valid email addresses.
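To make that difference concrete, here is a quick Python check. Everything about the "full" event is invented for illustration (the real leading fields were not posted), and the named groups use Python's (?P<name>...) spelling.

```python
import re

EMAIL = r'[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5}'
# Content-based: two consecutive quoted email addresses, wherever they are.
PAIR = re.compile(r'"(?P<caller>' + EMAIL + r')","(?P<receiver>' + EMAIL + r')"')
# Position-based: anchored on whitespace plus five literal dots.
POSITIONAL = re.compile(
    r'\s+[.]{5}"",".+?",".+?",(?P<email1>".+?"),(?P<email2>".+?"),'
)

# Hypothetical full event: no literal "....." prefix, invented first field.
full_event = (
    '"abc123","","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM",'
    '"natalie.someone@email.com","andrew.someone@email.com","UCCAPI/16.0"'
)

positional_hit = POSITIONAL.search(full_event)
pair_hit = PAIR.search(full_event)

print(positional_hit)                             # no match without the dots
print(pair_hit.groupdict() if pair_hit else {})   # still finds both addresses
```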