Getting Data In

XML tag extraction

OMohi
Path Finder

I have a data source that reads in events in XML format. Could someone please help me build a props.conf that will extract all fields and show the events in a tree view? Sample event below:

Fri Aug 07 13:42:37 EDT 2015 name="QUEUE_msg_received" event_id="ID:414d51204d514942513032202020202055bdd7d620016441" msg_dest="QA.EA.ELOG.BUSINESSEVENT1" msg_body="<?xml version="1.0" encoding="UTF-8"?><v1:BusinessEventRequest xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v1="http://schemas.humana.com/Infrastructure/Utility/Logging/BusinessEventRequest/V1.1"><v1:Busine... xmlns:mstns="http://enrollmentservices.humana.com/Schema/BAMSchema/v1.0"><mstns:EventSource>FileIntake&l... upload requested</mstns:MilestoneEvent><mstns:MilestoneState>Begin</mstns:MilestoneState><mstns:DataElements><mstns:FileName/><mstns:FileSize>9008</mstns:FileSize><mstns:AdditionalInfo>File upload requested</mstns:AdditionalInfo></mstns:DataElements></mstns:Milestone></mstns:BAMEvent></EventInformation></v1:BusinessProcessInformation></v1:BusinessEvent></v1:BusinessEventRequest>"

OMohi
Path Finder

I tried using KV_MODE = xml, but the events also contain some non-XML fields, so the extraction doesn't work. The solution I found is to define the following in props.conf:

[sourcetype]
REPORT-xmlkv = xmlkv-alternative

In transforms.conf

[xmlkv-alternative]
REGEX = <([^\s>]*)[^>]*>([^<]*)<\/\1>
FORMAT = $1::$2

This works, and I was able to extract every XML tag as a field.
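For anyone who wants to check what this pattern captures before deploying it, here is a minimal Python sketch. It assumes the pattern carries its `*` quantifiers, i.e. `<([^\s>]*)[^>]*>([^<]*)<\/\1>` (forum formatting tends to swallow asterisks, and without them only two-character tags would match). The event string is a shortened, hypothetical version of the sample in the question, and Python's `re` only approximates how Splunk applies the transform; Splunk may also clean up field names, which this sketch does not attempt.

```python
import re

# The transforms.conf pattern, with its "*" quantifiers in place.
# Group 1 is the tag name, group 2 the text up to the matching close tag.
XMLKV_ALTERNATIVE = re.compile(r"<([^\s>]*)[^>]*>([^<]*)<\/\1>")

# Hypothetical, shortened event modeled on the sample in the question:
# non-XML key="value" fields followed by an XML payload.
event = (
    'Fri Aug 07 13:42:37 EDT 2015 name="QUEUE_msg_received" '
    'msg_body="<mstns:MilestoneState>Begin</mstns:MilestoneState>'
    '<mstns:DataElements><mstns:FileName/>'
    '<mstns:FileSize>9008</mstns:FileSize>'
    '<mstns:AdditionalInfo>File upload requested</mstns:AdditionalInfo>'
    '</mstns:DataElements>"'
)

# Mirror FORMAT = $1::$2 by turning each (tag, text) pair into a field.
fields = dict(XMLKV_ALTERNATIVE.findall(event))

for name, value in fields.items():
    print(f"{name} = {value}")
```

Only leaf elements that contain text become fields. Container elements such as `mstns:DataElements` and empty elements such as `mstns:FileName/` are skipped, because the pattern requires plain text directly between an opening tag and its matching closing tag.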

We could also pipe each search to xmlkv (| xmlkv), but the client wanted business users to be able to read the data as simply as possible.

ejenson
Explorer

This was very helpful for my situation, where the events are a mix of XML and non-XML.
I had to tweak the regex in my transforms.conf.

sduff_splunk
Splunk Employee

In your props.conf, you should be able to use KV_MODE = xml to extract XML data.

You could also use the spath command to extract fields at search time:
http://docs.splunk.com/Documentation/Splunk/6.2.4/SearchReference/spath
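To illustrate why the mix of plain fields and XML matters for any XML-aware extraction, here is a small Python sketch. It is not how Splunk implements KV_MODE or spath; it only shows the general idea that a strict XML parser rejects the whole event but accepts the payload once it has been isolated. The event is a shortened, hypothetical version of the sample in the question with namespace prefixes dropped, and the helper names `parses_as_xml` and `flatten` are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, shortened event: non-XML fields followed by an XML payload.
event = (
    'Fri Aug 07 13:42:37 EDT 2015 name="QUEUE_msg_received" '
    'msg_body="<?xml version="1.0" encoding="UTF-8"?>'
    '<BAMEvent><EventSource>FileIntake</EventSource>'
    '<Milestone><MilestoneState>Begin</MilestoneState>'
    '<DataElements><FileSize>9008</FileSize></DataElements>'
    '</Milestone></BAMEvent>"'
)


def parses_as_xml(text):
    """Return True if the text is a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Isolate the payload: skip the optional XML declaration, then capture
# everything up to the closing quote at the end of the event.
match = re.search(r'msg_body="(?:<\?xml[^>]*\?>)?(.*)"\s*$', event, re.S)
payload = match.group(1)


def flatten(element, prefix=""):
    """Yield (dotted.path, text) pairs for every leaf element."""
    path = f"{prefix}.{element.tag}" if prefix else element.tag
    children = list(element)
    if not children:
        yield path, (element.text or "")
    for child in children:
        yield from flatten(child, path)


fields = dict(flatten(ET.fromstring(payload)))

print(parses_as_xml(event))    # the raw event is not well-formed XML
print(parses_as_xml(payload))  # the isolated payload is
for name, value in fields.items():
    print(f"{name} = {value}")
```

The dotted paths keep the nesting that a flat tag-to-value regex throws away, which is the trade-off between the two approaches discussed in this thread.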
