Splunk Search

Regex help - disregard everything after a match

Motivator

I have the following regex that is pulling the sender and receiver domains:

"SenderAddress":"\w+.*@(?<s_domain>.*)","RecipientAddress":"\w+.*@(?<r_domain>.*)

The issue is that some emails don't have the 'Subject' field right after the receiver domain.

How can I write the regex so that, once it has matched the receiver domain, it ignores everything after it? That would handle events where fields are missing.

Thx

Re: Regex help - disregard everything after a match

Influencer

@jwalzerpitt instead of .* use \w+

Re: Regex help - disregard everything after a match

Motivator

Thx

I tried \w+ (https://regex101.com/r/nInHIF/1) but it still matches everything after blah.edu:

blah.edu","Status":"Delivered","Size":"0"}
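
For reference, a minimal Python sketch of why the original pattern over-captures: a greedy `.*` inside the capture group runs to the end of the event, while a negated character class stops at the closing quote. Using Python's `re` module is an assumption made for illustration (it needs the `(?P<name>...)` spelling for named groups), and the event is a trimmed copy of the sample without a Subject field that appears in this thread.

```python
import re

# Trimmed copy of the sample event that has no Subject field.
event = '"SenderAddress":"DoNotReply@ConnectedCommunity.org","RecipientAddress":"jfz5@blah.edu","Status":"Delivered","Size":"0"}'

# Original style: the capture group is a greedy .* and runs to the end of the event.
greedy = re.search(r'"RecipientAddress":"\w+.*@(?P<r_domain>.*)', event)

# Bounded style: [^"]+ cannot cross the closing double quote of the value.
bounded = re.search(r'"RecipientAddress":"[^@"]+@(?P<r_domain>[^"]+)', event)

print(greedy.group("r_domain"))   # blah.edu","Status":"Delivered","Size":"0"}
print(bounded.group("r_domain"))  # blah.edu
```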

Re: Regex help - disregard everything after a match

Legend

Hi jwalzerpitt,
could you share two examples of your logs, one for each kind of log?
Bye.
Giuseppe

Re: Regex help - disregard everything after a match

Motivator

Thx for the reply.

Here is an email in its raw format with the Subject field present:

{"EventReceivedTime":"2019-06-26 09:52:21","SourceModuleName":"EXCHGETMESGTRACEPRD","SourceModuleType":"im_file","MessageId":"<eb63e665210e9449d5b386d1ae679faa@3e723b591bdb95ce8f5c9b7032dc572ca97351d0da5efc73459c1fbaf438e43b>","Received":"6/26/2019 9:39:47 AM","SenderAddress":"notification@facebookmail.com","RecipientAddress":"user@blah.edu","Subject":"See Who Liked Your Page","Status":"Delivered","FromIP":"69.171.232.138","Size":"56928"}

Here is an email in its raw format with the Subject field missing:

{"EventReceivedTime":"2019-06-26 09:47:53","SourceModuleName":"EXCHGETMESGTRACEPRD","SourceModuleType":"im_file","MessageId":"<0100016b93ffa54d-a8a96a78-94b0-46e6-aed6-c0e82ef6d228-000000@email.amazonses.com>","Received":"6/26/2019 9:35:33 AM","SenderAddress":"DoNotReply@ConnectedCommunity.org","RecipientAddress":"jfz5@blah.edu","Status":"Delivered","Size":"0"}

Thx

Re: Regex help - disregard everything after a match

Legend

Hi jwalzerpitt,
try something like this:

SenderAddress\":\"(?P<s_domain>[^\"]*).*RecipientAddress\":\"(?P<r_domain>[^\"]*)

You can test it at https://regex101.com/r/H3sxjR/1
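
The same check can be sketched offline in Python. This is only an illustration under a few assumptions: Python's `re` module stands in for Splunk's regex engine, the backslash-escaped quotes are dropped because they are not needed in a Python raw string, and the two events are trimmed copies of the samples posted above.

```python
import re

# Pattern from this post; (?P<name>...) is also the spelling Python requires.
pattern = re.compile(
    r'SenderAddress":"(?P<s_domain>[^"]*).*RecipientAddress":"(?P<r_domain>[^"]*)'
)

# Trimmed copies of the two sample events (fields before SenderAddress omitted).
with_subject = '"SenderAddress":"notification@facebookmail.com","RecipientAddress":"user@blah.edu","Subject":"See Who Liked Your Page","Status":"Delivered","FromIP":"69.171.232.138","Size":"56928"}'
without_subject = '"SenderAddress":"DoNotReply@ConnectedCommunity.org","RecipientAddress":"jfz5@blah.edu","Status":"Delivered","Size":"0"}'

# Run the pattern against both kinds of event and collect the named groups.
results = [pattern.search(e).groupdict() for e in (with_subject, without_subject)]
for r in results:
    print(r)
```

In this sketch both events match and nothing after the recipient value is captured. Note that each group holds the full address (for example `user@blah.edu`), not just the domain, because the group starts right after the opening quote.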

Bye.
Giuseppe

Re: Regex help - disregard everything after a match

Motivator

Giuseppe,

Thx, that regex worked! Greatly appreciated.

Re: Regex help - disregard everything after a match

Builder

While that regex works, it might not be best practice. Please review my answer elsewhere in this thread for a better alternative for what you're trying to do with regex.

Re: Regex help - disregard everything after a match

Motivator

After further testing, I did apply the regex you recommended

Thx

Re: Regex help - disregard everything after a match

Builder

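Hi @jwalzerpitt,
You just need to anchor your regex properly: replace the greedy .* with negated character classes ([^@]+ and [^"]+) so each piece stops at the next @ or at the closing double quote.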

Try the following:

"SenderAddress":"[^@]+@(?<sdomain>[^"]+)","RecipientAddress":"[^@]+@(?<rdomain>[^"]+)"

If you're using rex, you'll have to escape the double quotes:

| rex "\"SenderAddress\":\"[^@]+@(?<sdomain>[^\"]+)\",\"RecipientAddress\":\"[^@]+@(?<rdomain>[^\"]+)\""
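
To see how this pattern behaves on both kinds of event outside Splunk, here is a small Python sketch. It rests on assumptions: Python's `re` module is used in place of Splunk's engine (so the groups are written `(?P<name>...)`), and the events are trimmed copies of the two samples posted earlier in the thread.

```python
import re

# Same pattern as above, with Python-style named groups.
pattern = re.compile(
    r'"SenderAddress":"[^@]+@(?P<sdomain>[^"]+)","RecipientAddress":"[^@]+@(?P<rdomain>[^"]+)"'
)

# Trimmed copies of the two sample events (fields before SenderAddress omitted).
with_subject = '"SenderAddress":"notification@facebookmail.com","RecipientAddress":"user@blah.edu","Subject":"See Who Liked Your Page","Status":"Delivered","FromIP":"69.171.232.138","Size":"56928"}'
without_subject = '"SenderAddress":"DoNotReply@ConnectedCommunity.org","RecipientAddress":"jfz5@blah.edu","Status":"Delivered","Size":"0"}'

# Extract only the domain part of each address from both events.
domains = [pattern.search(e).groupdict() for e in (with_subject, without_subject)]
for d in domains:
    print(d)
```

Because the match ends at the closing quote of the RecipientAddress value, it does not matter whether Subject, Status, or any other field comes next.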
