Splunk Search

How to extract the email address from my logs at either search time or index time?

smudge797
Path Finder

I need to extract the email address from the following logs, either in a search or via props.conf / transforms.conf. Any help much appreciated.

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To:
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From:
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To:
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To:
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: New SMTP ICID 57360165 interface InternalNet (10.123.78.14) address 10.123.245.103 reverse dns host unknown verified no

Thanks!

smudge797
Path Finder

The emails inside the < > were removed from my post, so I have taken the < > out to show them all. Ideally I need the To: and From: addresses as searchable fields that show up in the discovered fields panel:

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To: me.rongan@gmail.com
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From: susmith@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To: oliver.whatmail@orange-blah.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To: bnagpalxxx@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: New SMTP ICID 57360165 interface InternalNet (10.123.78.14) address 10.123.245.103 reverse dns host unknown verified no

Looks like my sample text had the emails inside the < > removed; with the < > taken out you can see the emails. Is there a way to have the To & From as searchable fields?

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To: blah@gah.com
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From: susmith@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '<54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To: oliver.whatmail@orange-blah.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To: bnagpalxxx@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: New SMTP ICID 57360165 interface InternalNet (10.123.78.14) address 10.123.245.103 reverse dns host unknown verified no

0 Karma

kristian_kolb
Ultra Champion

You can use the 'code sample' formatting option when you have 'special' characters in your posts. It's the little button with "101010". Also, I updated my answer above.

/k

0 Karma

kristian_kolb
Ultra Champion

With rex (search time) it's as easy as:

your search | rex "<(?<myemail>[^>]+)" | blah blah

In props.conf (also search time):

[your sourcetype]
EXTRACT-blah = <(?<myemail>[^>]+)

Don't try to do it at index-time. That is not what you want.
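
To see what that pattern captures, here is a minimal Python sketch run against one sample line. It is only an illustration outside Splunk: Python's `re` module spells named groups `(?P<name>...)` where Splunk accepts `(?<name>...)`, the helper name `extract_myemail` is hypothetical, and the event assumes the angle brackets that the forum stripped are present in the raw log.

```python
import re

# Same pattern as the rex/EXTRACT above; Python spells the named group (?P<name>...).
MYEMAIL = re.compile(r"<(?P<myemail>[^>]+)")


def extract_myemail(event):
    """Return the text between the first '<' and the next '>', or None."""
    match = MYEMAIL.search(event)
    return match.group("myemail") if match else None


# Assumed raw event, with the angle brackets put back around the address.
event = (
    "Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: "
    "Info: MID 20842705 ICID 57360165 From: <susmith@expand.com>"
)
print(extract_myemail(event))
```

Because the pattern takes anything between `<` and `>`, it would also pick up the bracketed Message-ID value, not only To/From addresses.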


UPDATE:

To extract email addresses into field names based on their context inside an event, you might want to try something like:

props.conf

[your sourcetype]
EXTRACT-to = \sTo:\s(?<to_addr>\S+)
EXTRACT-from = \sFrom:\s(?<from_addr>\S+)

This should work for the to/from email addresses of the formats below.

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To: me.rongan@gmail.com
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From: susmith@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '<54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com>'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To: oliver.whatmail@orange-blah.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To: bnagpalxxx@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com
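
As a rough check of the two context-based extractions, the Python sketch below applies equivalent patterns to three of the sample lines. It assumes the From pattern is anchored on leading whitespace like the To pattern (`\sFrom:\s`), uses Python's `(?P<name>...)` spelling for named groups, and the helper `extract_fields` is a hypothetical name, not anything Splunk provides.

```python
import re

# Python equivalents of EXTRACT-to / EXTRACT-from (named groups use (?P<name>...)).
TO_ADDR = re.compile(r"\sTo:\s(?P<to_addr>\S+)")
FROM_ADDR = re.compile(r"\sFrom:\s(?P<from_addr>\S+)")


def extract_fields(event):
    """Return a dict holding to_addr and/or from_addr when the event has them."""
    fields = {}
    for pattern in (TO_ADDR, FROM_ADDR):
        match = pattern.search(event)
        if match:
            fields.update(match.groupdict())
    return fields


PREFIX = "Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: "
events = [
    PREFIX + "MID 20842705 ICID 57360165 RID 0 To: me.rongan@gmail.com",
    PREFIX + "MID 20842705 ICID 57360165 From: susmith@expand.com",
    PREFIX + "MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com",
]
for event in events:
    print(extract_fields(event))
```

The third line yields no fields: the match is case-sensitive and requires the colon, so "bytes from" is not treated as a From: header.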

If you also want to extract the other addresses, you could add the following:

EXTRACT-other = [=<](?<other_addr>[a-zA-Z0-9.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+)

Should work
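
A quick way to sanity-check this kind of pattern is to run it outside Splunk. The Python sketch below assumes the last character class is repeated with `+` so the whole domain is captured, and that the hyphen sits at the end of each class; `extract_other` is a hypothetical helper, and the named group uses Python's `(?P<name>...)` spelling.

```python
import re

# Assumed variant of EXTRACT-other: hyphen last in each class, trailing '+'
# so the capture runs to the end of the domain.
OTHER_ADDR = re.compile(
    r"[=<](?P<other_addr>[a-zA-Z0-9.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+)"
)


def extract_other(event):
    """Return the address that follows '=' or '<', or None if there is none."""
    match = OTHER_ADDR.search(event)
    return match.group("other_addr") if match else None


prvs_line = "MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com"
msgid_line = (
    "MID 20842704 Message-ID "
    "'<54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com>'"
)
print(extract_other(prvs_line))
print(extract_other(msgid_line))
```

Without the trailing `+`, the capture would stop one character after the first dot of the domain.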

/k

smudge797
Path Finder

In props.conf:
Using EXTRACT-to = \sTo:\s(?<to_addr>\S+) is working well 🙂
Using EXTRACT-from = \sFrom:\s(?<from_addr>\S+) is showing as | from_addr=<> when searching?

Some are working but the majority are showing blank.

0 Karma

ulrich_track
Path Finder

I would use this regex:
<([0-9A-Za-z.]+)(@)([0-9A-Za-z.]+)>

Tested it with regex101.com

0 Karma

ulrich_track
Path Finder

True - I forgot - in Splunk you do not search for the string, you search for what is before and after it.

0 Karma

kristian_kolb
Ultra Champion

That extracts three separate strings, and it does not put them into a field.
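
The difference can be illustrated outside Splunk with a small Python sketch. The first pattern is the regex suggested above; the second is a hypothetical variant with a single named group (the name `email` is an assumption), written in Python's `(?P<name>...)` spelling. The sample string is made up.

```python
import re

# The suggested regex: three unnamed capture groups (local part, '@', domain).
UNNAMED = re.compile(r"<([0-9A-Za-z.]+)(@)([0-9A-Za-z.]+)>")
# Hypothetical alternative: one named group around the whole address.
NAMED = re.compile(r"<(?P<email>[0-9A-Za-z.]+@[0-9A-Za-z.]+)>")

sample = "From: <susmith@expand.com>"
unnamed = UNNAMED.search(sample)
named = NAMED.search(sample)

print(unnamed.groups())     # three separate strings
print(unnamed.groupdict())  # no named groups, so nothing to use as a field name
print(named.groupdict())    # one name mapped to the full address
```

A field extraction needs a named group to know what to call the field, which is why the unnamed version matches on regex101.com but produces no field.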

markthompson
Builder

You might be able to use the transaction command, for example transaction startswith="<" endswith=">".

0 Karma

kristian_kolb
Ultra Champion

No, it does not work that way. A transaction is a method for grouping separate events together, based on some characteristic, such as a common field value.
