Splunk Search

Microsoft O365 add-on: How to extract data from emails?

sanju2408de
Explorer

I am having trouble extracting data from emails ingested with the Microsoft O365 email add-on.

I want to extract the "Requested for" and "Finished" fields, whose values in the sample below are "ABC.ITGLOBAL@XYZ.com" and "Fri, Mar 11 2022 15:09:29 GMT+00:00" respectively.

I tested the regex below on Regex101, where it matched the value for "Requested for", but the same pattern doesn't work in Splunk.

(?i) for\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\w+\-\w+:\w+\-\w+\"\>(?P<Requested_For>\S+)(?=\<\/td)

I would appreciate any help sorting this out. A fragment of the raw email HTML follows.

Finished</td><td class="" style="vertical-align:top; padding:10px 4px; border-bottom:solid #eaeaea 1px; text-align:left; white-space:normal; width:99%; word-break:break-word">Fri, Mar 11 2022 15:09:29 GMT+00:00</td></tr><tr><td class="" style="vertical-align:top; padding:10px 4px; border-bottom:solid #eaeaea 1px; text-align:left; white-space:nowrap; font-weight:600; min-width:130px">Requested for</td><td class="" style="vertical-align:top; padding:10px 4px; border-bottom:solid #eaeaea 1px; text-align:left; white-space:normal; width:99%; word-break:break-word">ABC.ITGLOBAL@XYZ.com</td></tr><tr><td class=""

 


richgalloway
SplunkTrust

Try this regex. It avoids depending on the number of words between the groups.

Finished\<\/td>\<[^>]+>(?<Finished>[^\<]+).*?Requested for\<\/td>\<[^\<]+>(?<Requested_For>[^\<]+)

Also, the (?i) flag is most likely unnecessary, since the keywords in the data ("Finished" and "Requested for") will probably always appear in the same case.
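For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch that runs it against the sample from the question. Two assumptions to note: Python's re module only accepts the (?P<name>...) spelling for named groups, so the groups are rewritten in that form (PCRE, which Splunk uses, accepts both spellings), and the names VALUE_STYLE, LABEL_STYLE, SAMPLE and extract_fields are made up for this illustration.

```python
import re

# The two style strings are copied from the sample in the question and kept
# in constants only to keep the lines short (hypothetical names).
VALUE_STYLE = ('vertical-align:top; padding:10px 4px; '
               'border-bottom:solid #eaeaea 1px; text-align:left; '
               'white-space:normal; width:99%; word-break:break-word')
LABEL_STYLE = ('vertical-align:top; padding:10px 4px; '
               'border-bottom:solid #eaeaea 1px; text-align:left; '
               'white-space:nowrap; font-weight:600; min-width:130px')

# Sample event fragment, reassembled from the question.
SAMPLE = (
    f'Finished</td><td class="" style="{VALUE_STYLE}">'
    'Fri, Mar 11 2022 15:09:29 GMT+00:00</td></tr>'
    f'<tr><td class="" style="{LABEL_STYLE}">Requested for</td>'
    f'<td class="" style="{VALUE_STYLE}">ABC.ITGLOBAL@XYZ.com</td></tr>'
    '<tr><td class=""'
)

# Same expression as above, with (?<name>...) written as (?P<name>...)
# because Python's re module requires that form.
PATTERN = re.compile(
    r'Finished\<\/td>\<[^>]+>(?P<Finished>[^\<]+).*?'
    r'Requested for\<\/td>\<[^\<]+>(?P<Requested_For>[^\<]+)'
)


def extract_fields(raw):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = PATTERN.search(raw)
    return match.groupdict() if match else None


print(extract_fields(SAMPLE))
# {'Finished': 'Fri, Mar 11 2022 15:09:29 GMT+00:00', 'Requested_For': 'ABC.ITGLOBAL@XYZ.com'}
```

The pattern anchors on the label text and the tag boundaries, not on a fixed number of whitespace-separated tokens, so it should keep matching if the style attribute gains or loses a property. In Splunk the expression would typically be passed to the rex command.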

---
If this reply helps you, Karma would be appreciated.


sanju2408de
Explorer

@richgalloway Thanks so much for your help; this worked.

We had a few more fields to extract from the same email, and I used the same regex pattern you provided. It worked perfectly.

 

Many thanks again for your help.
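For readers with the same problem, reusing the pattern for other fields could look roughly like the Python sketch below, where the label is a parameter. This is an illustration only: the function name extract_by_label and the shortened sample row are hypothetical, and it assumes every field follows the same layout of a label cell followed by a value cell.

```python
import re


def extract_by_label(raw, label):
    """Return the text of the cell that follows the cell holding `label`.

    Assumes the label cell is immediately followed by one opening tag and
    then the value, as in the sample email. Returns None when not found.
    """
    # re.escape guards against labels that contain regex metacharacters.
    pattern = re.escape(label) + r'</td><[^>]+>(?P<value>[^<]+)'
    match = re.search(pattern, raw)
    return match.group('value') if match else None


# Shortened, made-up row in the same shape as the sample email.
row = ('<tr><td class="k">Requested for</td>'
       '<td class="v">ABC.ITGLOBAL@XYZ.com</td></tr>')

print(extract_by_label(row, 'Requested for'))  # ABC.ITGLOBAL@XYZ.com
```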
