Multivalue XML extraction not working

responsys_cm
Builder

I'm trying to add several lines of XML to a multi-valued field. The data looks like:

<EXPLT>

<REF><![CDATA[CVE-2011-4885]]></REF>

<DESC><![CDATA[PHP Hashtables Denial of Service - The Exploit-DB Ref : 18296]]></DESC>

<LINK><![CDATA[http://www.exploit-db.com/exploits/18296]]></LINK>

</EXPLT>

<EXPLT>

<REF><![CDATA[CVE-2011-4885]]></REF>

<DESC><![CDATA[PHP Hash Table Collision Proof Of Concept - The Exploit-DB Ref : 18305]]></DESC>

<LINK><![CDATA[http://www.exploit-db.com/exploits/18305]]></LINK>

</EXPLT>

<EXPLT>

<REF><![CDATA[CVE-2011-4153]]></REF>

<DESC><![CDATA[PHP 5.3.8 Multiple Vulnerabilities - The Exploit-DB Ref : 18370]]></DESC>

<LINK><![CDATA[http://www.exploit-db.com/exploits/18370]]></LINK>

</EXPLT>

<EXPLT>

<REF><![CDATA[CVE-2011-4885]]></REF>

<DESC><![CDATA[MyBulletinBoard (MyBB) <= 1.1.5 (CLIENT-IP) SQL Injection Exploit - The Exploit-DB Ref : 2012]]></DESC>

<LINK><![CDATA[http://www.exploit-db.com/exploits/2012]]></LINK>

</EXPLT>

<EXPLT>

<REF><![CDATA[CVE-2012-0781]]></REF>

<DESC><![CDATA[PHP 5.3.8 Multiple Vulnerabilities - The Exploit-DB Ref : 18370]]></DESC>

<LINK><![CDATA[http://www.exploit-db.com/exploits/18370]]></LINK>

</EXPLT>

My transforms.conf looks like:

[qualys_exploit]

REGEX = (?mis)(<EXPLT>.*</EXPLT>)

FORMAT = qualys_exploit::$1

MV_ADD = true

props.conf:

REPORT-qualys_exploit = qualys_exploit

Splunk is taking everything between the first opening EXPLT tag and the last closing EXPLT tag and making it a single value. What am I doing wrong that it's not treating these as multiple individual values?

Thx.

C

1 Solution

andreas
Explorer

The * quantifier in the REGEX is greedy, so the expression .* consumes every character up to the last </EXPLT>.
Try adding a ? after the * to make it non-greedy, so that the regex stops at the next </EXPLT> rather than the last.

REGEX = (?mis)(<EXPLT>.*?</EXPLT>)
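
To see the difference outside Splunk, here is a minimal sketch. It uses Python's re module as a stand-in for Splunk's regex engine (an assumption on my part; greedy and lazy quantifiers and the (?mis) flags behave the same way for this pattern), and the sample string is an abbreviated two-block version of the data in the question.

```python
import re

# Abbreviated sample modeled on the question's data: two EXPLT blocks.
sample = """<EXPLT>
<REF><![CDATA[CVE-2011-4885]]></REF>
</EXPLT>
<EXPLT>
<REF><![CDATA[CVE-2011-4153]]></REF>
</EXPLT>"""

# Greedy: .* runs to the end of the text, then backtracks to the LAST </EXPLT>.
greedy = re.findall(r"(?mis)(<EXPLT>.*</EXPLT>)", sample)

# Lazy: .*? expands one character at a time and stops at the NEXT </EXPLT>.
lazy = re.findall(r"(?mis)(<EXPLT>.*?</EXPLT>)", sample)

print(len(greedy))  # 1 (both blocks swallowed into a single match)
print(len(lazy))    # 2 (one match per EXPLT block)
```

With MV_ADD = true, each separate match of the lazy pattern should become its own value of qualys_exploit, which is the behavior the question was after.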
