Splunk Search

Field extraction from html email

manderson7
Contributor

I've been banging my head against the wall trying to get this to work, and not succeeding, obviously. I have a 217-line email from my power company that I've ingested using the imap app, and I want to get a few fields out of it. The built-in field extractor will only show the first 15 lines of the email, so that's not helpful, and the regexes I've cobbled together aren't pulling out the data. The data that I want is:

 font face="Arial, Helvetica, sans-serif" size="2" color="" style="font-size: 13px">43 kWh </font

I want the "43 kWh" from that line, and to name the field "previousday_energy_use".
The next line I want is:

 font face="Arial, Helvetica, sans-serif" size="2" color="#3e3e3e" style="font-size: 13px">$5</font>

and I want the "$5" data.

I'd appreciate any help here.
If I do a

rex field=_raw "(?<kWh>13px\">.*?<)"   

the field created just shows

13px"><
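
One possible explanation for that result, offered as a guess since the full email isn't shown: rex returns only the first match by default, so if an earlier tag in the email also ends in `13px">` and is followed directly by another tag, the lazy `.*?<` matches nothing before it reaches `<`. The capture group also wraps the literal text, so `13px">` and `<` end up in the field either way. A minimal Python sketch of that behavior follows; the `sample` text is made up for illustration, and Python spells named groups as `(?P<name>...)`.

```python
import re

# Hypothetical stand-in for the email body (the real email is not in the post):
# an earlier tag also ends in 13px"> and is followed directly by another tag.
sample = (
    '<td style="font-size: 13px"><b>Yesterday</b></td>\n'
    '<font face="Arial, Helvetica, sans-serif" size="2" color="" '
    'style="font-size: 13px">43 kWh </font>\n'
)

# Same pattern as the attempt, with the whole expression inside the group.
attempt = re.compile(r'(?P<kWh>13px">.*?<)')

# search() returns only the first match, like rex with its default max_match.
first_hit = attempt.search(sample).group("kWh")

# Even on the target line alone, the literals are part of the captured value.
target_line = sample.splitlines()[1]
on_target = attempt.search(target_line).group("kWh")

print(repr(first_hit))
print(repr(on_target))
```

If this is what is happening, moving the literal `13px">` outside the capture group and making the group body specific (digits followed by `kWh`) avoids both problems.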

cpetterborg
SplunkTrust

Does this work for you?:

... | rex "13px\">(?P<previousday_energy_use>\d+\skWh)" | rex "13px\">(?P<cost>\$\d+)"
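
As a quick sanity check outside Splunk, the same two patterns can be run against the two sample lines from the question with Python's `re` module. This is only a sketch: the variable names are illustrative, and the `\"` in the rex strings is just SPL string escaping, so the regex itself sees a plain `"`.

```python
import re

# The two sample lines exactly as quoted in the question.
lines = [
    'font face="Arial, Helvetica, sans-serif" size="2" color="" '
    'style="font-size: 13px">43 kWh </font',
    'font face="Arial, Helvetica, sans-serif" size="2" color="#3e3e3e" '
    'style="font-size: 13px">$5</font>',
]
raw = "\n".join(lines)

# The literal 13px"> sits outside the group, so only the value is captured.
energy = re.search(r'13px">(?P<previousday_energy_use>\d+\skWh)', raw)
cost = re.search(r'13px">(?P<cost>\$\d+)', raw)

energy_value = energy.group("previousday_energy_use")
cost_value = cost.group("cost")

print(energy_value)
print(cost_value)
```

Because each group body is specific (`\d+\skWh`, `\$\d+`), an earlier `13px">` that is followed by something else is skipped rather than matched.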

manderson7
Contributor

Yes, thank you! That's it exactly. Forgot my slashes, apparently.
