Splunk Search

How to extract multiple values from XML logs and display all events where FieldA is not equal to FieldB?

Engager

I have some XML responses logged in Splunk that are fairly deeply nested. Say each response contains multiple records of the following form:

<records>
      <record>
        <Full Name>Ms. Brown Grimes</Full Name>
        <Country>Dronning Maud Land</Country>
        <NotificationEmail>Sam.Lemke@mckenzie.info</NotificationEmail>
        <Created At>Fri Aug 25 1989 22:17:00 GMT-0700 (Pacific Daylight Time)</Created At>
        <Id>10</Id>
        <Email>Sam.Lemke@mckenzie.info</Email>
      </record>
      <record>
        <Full Name>Irma Ledner I</Full Name>
        <Country>Vatican City</Country>
        <NotificationEmail>GabrielleGmail@gmail.com</NotificationEmail>
        <Created At>Tue Nov 30 1993 08:16:58 GMT-0800 (Pacific Standard Time)</Created At>
        <Id>12</Id>
        <Email>Gabrielle@myrl.biz</Email>
      </record>
    </records>

Now I want to find all records where NotificationEmail is not equal to Email.

What I tried was piping the events to a regex extraction:

rex "<record.*NotificationEmail>(?<nemail>\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)<.*Email>(?<email>\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)<"

where \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b is the regex for matching an email address.

Re: How to extract multiple values from XML logs and display all events where FieldA is not equal to FieldB?

SplunkTrust

Do you want to filter the whole response (record set) when any of its records has NotificationEmail equal to Email, or filter individual records within a response (record set) where NotificationEmail is equal to Email?

Re: How to extract multiple values from XML logs and display all events where FieldA is not equal to FieldB?

Legend

The problem is that you need to extract multiple copies of the fields, assuming that the event is defined by the "<records>" tag.
Within the event, you have multiple values. There are a couple of ways to deal with this, but one would be:

yoursearchhere
| rex max_match=0 "(?s)(?<record><record>.*?</record>)"
| mvexpand record
| rex field=record "(?is)<record.*NotificationEmail>(?<nemail>\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)<.*Email>(?<email>\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b)<"
| where nemail!=email

The first rex and mvexpand break the original event into multiple events, one for each "record." After that, the original rex is applied to each record and the comparison is made. I didn't verify that the regular expression is correct. Personally, I would have done something much simpler:

| rex field=record "(?s)<NotificationEmail>(?<nemail>.*?)</NotificationEmail>.*?<Email>(?<email>.*?)</Email>"
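
To try the rex approach without any indexed data, here is a run-anywhere sketch. It assumes a trimmed-down copy of the two sample records from the question, built with makeresults and without the "Full Name" and "Created At" elements. The (?s) flag is there because the tags sit on separate lines in the real events, and field=record makes the second rex read the expanded record rather than the whole event. If it behaves as intended, only the record with the mismatched addresses should remain.

| makeresults
| eval _raw="<records><record><NotificationEmail>Sam.Lemke@mckenzie.info</NotificationEmail><Id>10</Id><Email>Sam.Lemke@mckenzie.info</Email></record><record><NotificationEmail>GabrielleGmail@gmail.com</NotificationEmail><Id>12</Id><Email>Gabrielle@myrl.biz</Email></record></records>"
| rex max_match=0 "(?s)<record>(?<record>.*?)</record>"
| mvexpand record
| rex field=record "(?s)<NotificationEmail>(?<nemail>.*?)</NotificationEmail>.*?<Email>(?<email>.*?)</Email>"
| where nemail!=email
| table nemail email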

Re: How to extract multiple values from XML logs and display all events where FieldA is not equal to FieldB?

SplunkTrust

Parsing XML with regex is a painful process, especially considering Splunk has commands tailored specifically for this.

Note that your example is not valid XML: element names cannot contain spaces. Once that's fixed, you can run this:

 search for your events | spath records.record | mvexpand records.record | spath input=records.record | where NOT Email=NotificationEmail

That will extract each record into its own event, parse the elements of the record, and filter according to the email fields.
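
For anyone who wants to test the spath approach first, here is a run-anywhere sketch under the same idea. It assumes a trimmed-down copy of the sample records (element names with spaces left out so the XML is valid), and the output=record name is just my choice to avoid a dotted field name. The record with Id 12 should be the only one left if the extraction works as described.

| makeresults
| eval _raw="<records><record><NotificationEmail>Sam.Lemke@mckenzie.info</NotificationEmail><Id>10</Id><Email>Sam.Lemke@mckenzie.info</Email></record><record><NotificationEmail>GabrielleGmail@gmail.com</NotificationEmail><Id>12</Id><Email>Gabrielle@myrl.biz</Email></record></records>"
| spath path=records.record output=record
| mvexpand record
| spath input=record
| where NOT Email=NotificationEmail
| table Id NotificationEmail Email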

Re: How to extract multiple values from XML logs and display all events where FieldA is not equal to FieldB?

Path Finder

You can let Splunk extract all the XML fields automatically by changing the props.conf file in the app of interest (say, search).

Here is a stanza example:

[my_xml_logs_source_type]
KV_MODE = xml
...