I'm ingesting XML DMARC reports into Splunk, but the individual events aren't very useful unless each one includes fields such as begin_date, end_date, org_name, email and report_id. Those values exist only in the report_metadata section at the top of the report. How can I take values that occur once per report and include them in every event?
Here's a sample of the XML data I'm ingesting:
<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
<report_metadata>
<org_name>emailsrvr.com</org_name>
<email>dmarc_reports@emailsrvr.com</email>
<extra_contact_info>http://emailsrvr.com</extra_contact_info>
<report_id>1a292ea2-d440-4985-a969-839778bceac1</report_id>
<date_range>
<begin>1487289600</begin>
<end>1487376000</end>
</date_range>
</report_metadata>
<policy_published>
<domain>mycompany.com</domain>
<adkim>r</adkim>
<aspf>r</aspf>
<p>none</p>
<sp>none</sp>
<pct>100</pct>
</policy_published>
<record>
<row>
<source_ip>192.168.x.x</source_ip>
<count>1</count>
<policy_evaluated>
<disposition>none</disposition>
<dkim>fail</dkim>
<spf>pass</spf>
</policy_evaluated>
</row>
<identifiers>
<header_from>mycompany.com</header_from>
</identifiers>
<auth_results>
<spf>
<domain>mycompany.com</domain>
<result>pass</result>
</spf>
</auth_results>
</record>
<record>
<row>
<source_ip>192.168.x.x</source_ip>
<count>1</count>
<policy_evaluated>
<disposition>none</disposition>
<dkim>pass</dkim>
<spf>fail</spf>
</policy_evaluated>
</row>
<identifiers>
<header_from>mycompany.com</header_from>
</identifiers>
<auth_results>
<spf>
<domain>mycompany.com</domain>
<result>temperror</result>
</spf>
</auth_results>
</record>
</feedback>
I haven't been able to figure out how to pull the date and org_name fields out of the report_metadata section and put them into each individual event. Ideally, I'd like my events to look something like this:
report_id=1a292ea2-d440-4985-a969-839778bceac1, date_begin=1487289600, date_end=1487376000, org_name=emailsrvr.com, source_ip=192.168.x.x, disposition=none, dkim=fail, spf=pass, header_from=mycompany.com
report_id=1a292ea2-d440-4985-a969-839778bceac1, date_begin=1487289600, date_end=1487376000, org_name=emailsrvr.com, source_ip=192.168.x.x, disposition=none, dkim=pass, spf=fail, header_from=mycompany.com
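To make the target concrete, here is a minimal Python sketch of the flattening I'm after, done outside Splunk. It is only an illustration, not something I'm running: the helper names flatten_dmarc and to_kv_line are made up, the sample is a trimmed copy of the report above, and it assumes the report is well-formed XML with a closing </feedback> tag.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample report (auth_results and policy_published omitted).
SAMPLE = """<feedback>
<report_metadata>
<org_name>emailsrvr.com</org_name>
<email>dmarc_reports@emailsrvr.com</email>
<report_id>1a292ea2-d440-4985-a969-839778bceac1</report_id>
<date_range><begin>1487289600</begin><end>1487376000</end></date_range>
</report_metadata>
<record>
<row>
<source_ip>192.168.x.x</source_ip>
<count>1</count>
<policy_evaluated><disposition>none</disposition><dkim>fail</dkim><spf>pass</spf></policy_evaluated>
</row>
<identifiers><header_from>mycompany.com</header_from></identifiers>
</record>
<record>
<row>
<source_ip>192.168.x.x</source_ip>
<count>1</count>
<policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>fail</spf></policy_evaluated>
</row>
<identifiers><header_from>mycompany.com</header_from></identifiers>
</record>
</feedback>"""


def flatten_dmarc(xml_text):
    """Return one dict per <record>, each carrying the report-level metadata."""
    root = ET.fromstring(xml_text)
    # Values that occur once per report.
    meta = {
        "report_id": root.findtext("report_metadata/report_id"),
        "date_begin": root.findtext("report_metadata/date_range/begin"),
        "date_end": root.findtext("report_metadata/date_range/end"),
        "org_name": root.findtext("report_metadata/org_name"),
    }
    events = []
    for record in root.findall("record"):
        event = dict(meta)  # copy the metadata into every event
        event["source_ip"] = record.findtext("row/source_ip")
        event["disposition"] = record.findtext("row/policy_evaluated/disposition")
        event["dkim"] = record.findtext("row/policy_evaluated/dkim")
        event["spf"] = record.findtext("row/policy_evaluated/spf")
        event["header_from"] = record.findtext("identifiers/header_from")
        events.append(event)
    return events


def to_kv_line(event):
    """Render an event as comma-separated key=value pairs."""
    return ", ".join(f"{key}={value}" for key, value in event.items())


if __name__ == "__main__":
    for event in flatten_dmarc(SAMPLE):
        print(to_kv_line(event))
```

Run against the trimmed sample, this prints the two event lines shown above. I'd prefer to get the same result inside Splunk rather than pre-processing the files, if that's possible.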
Some questions:
1. Should I ingest the whole XML file as a single event and then do the processing at search time, or should I use XML_KV and break events on the <record> tag?
2. How do I parse out the values in the report_metadata field and apply them to each event?
Here's my current sourcetype configuration for this data source:
[dmarc_xml]
DATETIME_CONFIG =
LINE_BREAKER = (<record>)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %s
TIME_PREFIX = <end>
category = Email
description = DMARC XML Reports
disabled = false
pulldown_type = true
BREAK_ONLY_BEFORE = (<record>)
KV_MODE = xml
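As a sanity check on the timestamp settings above (TIME_FORMAT = %s with TIME_PREFIX = <end>), this small Python sketch shows what the epoch values in the sample's date_range correspond to in UTC. The helper name epoch_to_utc is made up for illustration. As far as I can tell, <end> only appears in the report_metadata section, so I'm assuming that once events are broken on <record>, only the chunk containing the metadata would match this TIME_PREFIX.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Convert epoch seconds (the form %s parses) to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    # <begin> and <end> values from the sample report's date_range.
    for label, epoch in (("begin", 1487289600), ("end", 1487376000)):
        print(label, epoch_to_utc(epoch))
```

So the sample report covers one day, from 2017-02-17 00:00:00 UTC to 2017-02-18 00:00:00 UTC.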
Thanks in advance for your help!