Splunk Search

Regex help - disregard everything after a match

jwalzerpitt
Influencer

I have the following regex that is pulling the sender and receiver domains:

"SenderAddress":"\w+.*@(?<s_domain>.*)","RecipientAddress":"\w+.*@(?<r_domain>.*)

The issue I have is that there are some emails that don't have the 'Subject' field right after the receiver domain.

How can I write my regex that once it's done matching the receiver domain, it ignores everything after which will address when fields are missing?

Thx

0 Karma
1 Solution

jnudell_2
Builder

Hi @jwalzerpitt ,
You just need to anchor your regex properly.

Try the following:

"SenderAddress":"[^@]+@(?<s_domain>[^"]+)","RecipientAddress":"[^@]+@(?<r_domain>[^"]+)"

If you're using rex, you'll have to escape the double quotes:

| rex "\"SenderAddress\":\"[^@]+@(?<s_domain>[^\"]+)\",\"RecipientAddress\":\"[^@]+@(?<r_domain>[^\"]+)\""

View solution in original post

jwalzerpitt
Influencer

Thx for the reply and the regex/rex and they both worked as well. Much appreciated

0 Karma

jwalzerpitt
Influencer

Please convert it to an answer when you get a chance

0 Karma

jnudell_2
Builder

Hi @jwalzerpitt ,
You just need to anchor your regex properly.

Try the following:

"SenderAddress":"[^@]+@(?<s_domain>[^"]+)","RecipientAddress":"[^@]+@(?<r_domain>[^"]+)"

If you're using rex, you'll have to escape the double quotes:

| rex "\"SenderAddress\":\"[^@]+@(?<s_domain>[^\"]+)\",\"RecipientAddress\":\"[^@]+@(?<r_domain>[^\"]+)\""

gcusello
SplunkTrust
SplunkTrust

Hi jwalzerpitt,
could you share two examples of your logs, one for each kind of log?
Bye.
Giuseppe

0 Karma

jwalzerpitt
Influencer

Thx for the reply.

Here is an email in its raw format with the Subject field present:

{"EventReceivedTime":"2019-06-26 09:52:21","SourceModuleName":"EXCHGETMESGTRACEPRD","SourceModuleType":"im_file","MessageId":"<eb63e665210e9449d5b386d1ae679faa@3e723b591bdb95ce8f5c9b7032dc572ca97351d0da5efc73459c1fbaf438e43b>","Received":"6/26/2019 9:39:47 AM","SenderAddress":"notification@facebookmail.com","RecipientAddress":"user@blah.edu","Subject":"See Who Liked Your Page","Status":"Delivered","FromIP":"69.171.232.138","Size":"56928"}

Here is an email in its raw format with the Subject field missing:

{"EventReceivedTime":"2019-06-26 09:47:53","SourceModuleName":"EXCHGETMESGTRACEPRD","SourceModuleType":"im_file","MessageId":"<0100016b93ffa54d-a8a96a78-94b0-46e6-aed6-c0e82ef6d228-000000@email.amazonses.com>","Received":"6/26/2019 9:35:33 AM","SenderAddress":"DoNotReply@ConnectedCommunity.org","RecipientAddress":"jfz5@blah.edu","Status":"Delivered","Size":"0"}

Thx

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi jwalzerpitt,
try something like this:

SenderAddress\":\"(?P<s_domain>[^\"]*).*RecipientAddress\":\"(?P<r_domain>[^\"]*)

You can test it at https://regex101.com/r/H3sxjR/1

Bye.
Giuseppe

jwalzerpitt
Influencer

Guiseppe,

Thx as that regex worked! Greatly appreciated

0 Karma

jnudell_2
Builder

While that regex works, it might not be the best practice for regex usage. Please review my answer above to see a better alternative for what you're trying to do with regex.

0 Karma

jwalzerpitt
Influencer

After further testing, I did apply the regex you recommended

Thx

0 Karma

Vijeta
Influencer

@jwalzerpitt instead of .* use \w+

0 Karma

jwalzerpitt
Influencer

Thx

I tried \w+ (https://regex101.com/r/nInHIF/1) but still matches everything after blah.edu:

blah.edu","Status":"Delivered","Size":"0"}
0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...