Splunk Search

Field extraction from html email

manderson7
Contributor

I've been banging my head against the wall trying to get this to work, without success. I have a 217-line email from my power company that I've ingested using the IMAP app, and I want to extract a few fields from it. The built-in field extractor only shows the first 15 lines of the email, so that's not helpful, and the regexes I've cobbled together aren't pulling out the data. The data that I want is:

 font face="Arial, Helvetica, sans-serif" size="2" color="" style="font-size: 13px">43 kWh </font

I want the "43 kWh" from that line, in a field named "previousday_energy_use".
The next line I want is:

 font face="Arial, Helvetica, sans-serif" size="2" color="#3e3e3e" style="font-size: 13px">$5</font>

and I want the "$5" data.

I'd appreciate any help here.
If I do a

rex field=_raw "(?<kWh>13px\">.*?<)"   

the field created just shows

13px"><
1 Solution

cpetterborg
SplunkTrust

Does this work for you?:

... | rex "13px\">(?P<previousday_energy_use>\d+\skWh)" | rex "13px\">(?P<cost>\$\d+)"


manderson7
Contributor

Yes, thank you! That's it exactly. I forgot my slashes, apparently.
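
For anyone wondering why the first attempt returned only `13px"><`: one plausible explanation is that the capture group wrapped the literal anchor text, and the lazy `.*?` stopped at the first `<` after the first `13px">` in the email, which may have been an adjacent tag. The sketch below reproduces that behaviour with Python's `re` module rather than Splunk itself, on the assumption that both engines treat these simple patterns the same way. The sample text, including the wrapping `<td>` element, is hypothetical; it is only modelled on the two snippets quoted in the question. Python needs the `(?P<name>...)` spelling for named groups, so the original `(?<kWh>...)` is written that way here.

```python
import re

# Hypothetical sample modelled on the snippets quoted in the question.
# The real email is 217 lines long and is not reproduced here; the <td>
# wrapper is an assumption added to show an earlier 13px"> followed by a tag.
SAMPLE = (
    '<td style="font-size: 13px"><font face="Arial, Helvetica, sans-serif" '
    'size="2" color="" style="font-size: 13px">43 kWh </font></td>\n'
    '<font face="Arial, Helvetica, sans-serif" size="2" color="#3e3e3e" '
    'style="font-size: 13px">$5</font>'
)

# First attempt: the anchors sit inside the group, and the lazy .*? is
# satisfied by zero characters when a "<" directly follows 13px">.
first_try = re.search(r'(?P<kWh>13px">.*?<)', SAMPLE)

# Accepted answer: anchor outside the group, value described explicitly.
energy = re.search(r'13px">(?P<previousday_energy_use>\d+\skWh)', SAMPLE)
cost = re.search(r'13px">(?P<cost>\$\d+)', SAMPLE)

print(first_try.group("kWh"))                   # 13px"><
print(energy.group("previousday_energy_use"))   # 43 kWh
print(cost.group("cost"))                       # $5
```

Moving the anchor text outside the capture group and describing the value itself (`\d+\skWh`, `\$\d+`) is what keeps each extracted field down to just the value.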
