Splunk Search

How do I extract an email address from raw data using regex?

dannili
Communicator

Hi all, I have some raw data that looks like this (only a fragment is shown):

....."","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM","natalie.someone@email.com","andrew.someone@email.com","UCCAPI/3823.323.10827.* OC/16.0.10827.20150 (Skype for Business)","UCCAPI/16.0.10730.2342342 OC/16.0.10730.20088 (Skype for Business)","","","","****-5042-5F76-A879-***7","","","","","200","[IM]","{""RequestType"":""BYE"",""RequestTime"":""2018-10-30T07:41:52.2147589"",""ContentType"":"""",""ResponseCode"":""200"",..

I want to extract the two email addresses from each raw event (natalie.someone@email.com and andrew.someone@email.com in this case) into two fields, caller_email and receiver_email.

Does anyone know how to do this? Thanks a lot!

1 Solution

FrankVl
Ultra Champion

Try this regex: \"(?<caller>[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5})\",\"(?<receiver>[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5})\"
https://regex101.com/r/wsaYMy/1/

But it might be worth investing some time in defining a proper delimiter-based extraction for the entire event.
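
The delimiter-based idea can be illustrated outside Splunk with a short Python sketch. It is not Splunk configuration (in Splunk this would presumably be set up with DELIMS and FIELDS in transforms.conf); it only shows that a quote-aware split on commas yields the two addresses by position. The sample event is a shortened copy of the one in the question, with the truncated JSON field trimmed to two keys for illustration, and the column positions CALLER_COL and RECEIVER_COL are hypothetical because the real event has more leading fields than the post shows.

```python
import csv
import io

# Shortened copy of the sample event from the question. The leading fields are
# omitted and the JSON field is trimmed, because the original post truncates them.
raw = (
    '"10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM",'
    '"natalie.someone@email.com","andrew.someone@email.com",'
    '"UCCAPI/16.0.10730.2342342 OC/16.0.10730.20088 (Skype for Business)",'
    '"","200","[IM]","{""RequestType"":""BYE"",""ResponseCode"":""200""}"'
)

# Hypothetical column positions, valid for this shortened sample only.
CALLER_COL = 2
RECEIVER_COL = 3


def split_event(line):
    """Split one comma-delimited, double-quoted event into a list of fields."""
    # csv.reader understands quoted fields and doubled quotes ("") inside them,
    # so commas inside the JSON field do not split it.
    return next(csv.reader(io.StringIO(line)))


fields = split_event(raw)
caller_email = fields[CALLER_COL]
receiver_email = fields[RECEIVER_COL]
print(caller_email, receiver_email)
```

The advantage over a pure regex is that every column becomes addressable by position, including the JSON payload at the end, which keeps its embedded commas.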

willymac650
New Member

I have a very similar question, although I could have one, two, or three email addresses in the raw data. With FrankVl's regex I get results only if there are exactly two email addresses; if I append another copy of the pattern, I get results only if there are exactly three. Is there a way to get results no matter how many email addresses appear in the raw data?

blaise
Explorer

Hi willymac650,
Try this regular expression:
(("[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+")+),
The + after the inner group tells it to repeat the pattern. I am no expert; I just googled "find repeat patterns in regex" and one of the pages explained this. I tried it on regex101.com with your data and was able to match multiple times.
The only problem now is that I don't know how to name each match using $.
Hope this helps
Blaise
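
A minimal Python sketch of the "any number of addresses" case, assuming the same character classes as the accepted answer: instead of repeating the pattern once per address, a single-address pattern is applied repeatedly and every hit is collected into a list. The three events are hypothetical. In Splunk itself, the closest equivalent is probably the max_match option of the rex command (0 meaning unlimited), which returns the hits as one multivalue field rather than separately named fields.

```python
import re

# One email address, using the character classes from the accepted answer
# but without the surrounding quotes and without any fixed count.
EMAIL = re.compile(r"[a-zA-Z0-9_\-.]+@[a-zA-Z0-9_\-.]+\.[a-zA-Z]{2,5}")


def extract_emails(raw):
    """Return every email address found in the event, in order of appearance."""
    return EMAIL.findall(raw)


# Hypothetical events containing one, two and three addresses.
events = [
    '"10/30/2018 7:31:08 AM","natalie.someone@email.com","200"',
    '"natalie.someone@email.com","andrew.someone@email.com","200"',
    '"a.one@email.com","b.two@email.com","c.three@email.com","[IM]"',
]

for event in events:
    print(extract_emails(event))
```

Because the result is a list, individual addresses are reached by index rather than by name, which sidesteps the question of how to name each match.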

dannili
Communicator

@FrankVl This worked perfectly, thanks a lot! Your suggestion is also noted.

blaise
Explorer

I have tried it on regex101.com and I think this will help you:

\s+[.]{5}"",".+?",".+?",(?<email1>".+?"),(?<email2>".+?"),

it extracts both emails and creates two fields called "email1" and "email2" to contain the result of the match.

\s+ one or more whitespace characters
[.]{5} 5 dots
"", 2 double-quote characters, followed by a comma
".+?" a pair of double quotes with anything inside; the ? keeps the match as short as possible (lazy rather than greedy)
, a comma
".+?", same as above again
(?<email1>".+?") same as above, but the parentheses mean the match is saved; by default it would be saved into $1, but the ?<email1> part names the variable into which the matching part is saved
, a comma
(?<email2>".+?") same as above, but this time the variable is called email2
, a comma

Hope this helps
Blaise
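
The breakdown above can be written as a commented pattern using Python's verbose regex mode, as a rough sketch. Two things are assumptions here: Python's re module spells named groups (?P<name>...) rather than (?<name>...), and the sample line is the fragment from the question with a leading space added so that \s+ has something to match. Note that the groups capture the surrounding double quotes, so the helper strips them.

```python
import re

# Positional pattern: each line mirrors one step of the explanation above.
POSITIONAL = re.compile(
    r'''
    \s+                  # one or more whitespace characters
    [.]{5}               # five literal dots
    "",                  # two double quotes followed by a comma
    ".+?",               # a quoted field, matched lazily
    ".+?",               # a second quoted field
    (?P<email1>".+?"),   # third quoted field, saved as email1
    (?P<email2>".+?"),   # fourth quoted field, saved as email2
    ''',
    re.VERBOSE,
)


def extract_positional(raw):
    """Return (email1, email2) without quotes, or None if the layout differs."""
    match = POSITIONAL.search(raw)
    if match is None:
        return None
    return match.group("email1").strip('"'), match.group("email2").strip('"')


# Fragment from the question, with a hypothetical leading space.
sample = (
    ' ....."","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM",'
    '"natalie.someone@email.com","andrew.someone@email.com",'
    '"UCCAPI/3823.323.10827.* OC/16.0.10827.20150 (Skype for Business)"'
)
print(extract_positional(sample))
```

This pattern depends entirely on field position, so it returns nothing when an event does not begin with whitespace and five dots.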

FrankVl
Ultra Champion

He is only showing a fragment of his log, so \s+[.]{5} will not match what actually appears at the start of his data. That's why, for my answer, I just created a regex that looks for 2 consecutive valid email addresses.
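
A small Python sketch of the "two consecutive email addresses" approach, to show that it does not depend on what precedes the addresses. It reuses the character classes from the regex in this thread; the (?P<name>...) spelling is Python's named-group syntax, and the second test event is hypothetical.

```python
import re

# One email address, as in the regex from this thread.
EMAIL = r"[a-zA-Z0-9_\-.]+@[a-zA-Z0-9_\-.]+\.[a-zA-Z]{2,5}"

# Two quoted addresses separated by a single comma.
PAIR = re.compile(rf'"(?P<caller>{EMAIL})","(?P<receiver>{EMAIL})"')


def extract_pair(raw):
    """Return {'caller': ..., 'receiver': ...} or None if no adjacent pair exists."""
    match = PAIR.search(raw)
    return match.groupdict() if match else None


# Fragment from the question.
fragment = (
    '....."","10/30/2018 7:31:08 AM","10/30/2018 7:41:52 AM",'
    '"natalie.someone@email.com","andrew.someone@email.com",'
    '"UCCAPI/3823.323.10827.* OC/16.0.10827.20150 (Skype for Business)"'
)

# Hypothetical event with a different set of leading fields.
other = '"abc","","42","natalie.someone@email.com","andrew.someone@email.com","200"'

print(extract_pair(fragment))
print(extract_pair(other))
```

Both events yield the same pair because the pattern anchors on the addresses themselves rather than on their column position.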
