How to extract multiple same-named child elements from XML data into their own unique fields?

bwheelock
Path Finder

I have some XML data broken into events, each containing several child elements that share the same name but represent distinct fields. For each event, I need to extract each of these child sections into its own uniquely named fields. This is probably easy to accomplish, but I cannot figure it out.

Referencing the sample data below, I need to extract the vendor's information as VendorName, VendorCity, etc., and likewise for the client and supplier I need ClientName, ClientCity, etc. Auto extraction obviously doesn't work in this case without transforms, and regular expressions are proving difficult because the real data has multiple addresses and phone numbers and may lack some of this information. Each XML file (event) is about 100 lines long.

I can get the information easily with spath, but as far as I can tell spath only works as a search-time extraction, with an output and path piped in for each field. That's fine for an occasional search or report, but otherwise I'm piping in about 50 lines.

What am I doing wrong?

<OrderForm>
  <ClientOrder PO="00000123">
    <Vendor ID="789">
      <Name>Paperclips, INC</Name>
      <Address>
        <Street>789 Paper St</Street>
        <City>San Francisco</City>
        <State>CA</State>
        <Zip>84989</Zip>
      </Address>
    </Vendor>
    <Supplier ID="224">
      <Name>Happy Paper Co.</Name>
      <Address>
        <Street>12455 Shipping Ave</Street>
        <City>Los Angeles</City>
        <State>CA</State>
        <Zip>92254</Zip>
      </Address>
    </Supplier>
    <Client ID="4152">
      <Name>Dunder Mifflin Infinity</Name>
      <Address>
        <Street>1725 Slough Ave</Street>
        <City>Scranton</City>
        <State>PA</State>
        <Zip>18503</Zip>
      </Address>
    </Client>
  </ClientOrder>
</OrderForm>

anhtran
New Member

How can you use spath to get VendorName, VendorCity, etc.? Thanks.

bwheelock
Path Finder
index=test_orders sourcetype=orderForms
| spath output=VendorName path=OrderForm.ClientOrder.Vendor.Name
| spath output=VendorCity path=OrderForm.ClientOrder.Vendor.Address.City
| spath output=ClientName path=OrderForm.ClientOrder.Client.Name
| spath output=ClientCity path=OrderForm.ClientOrder.Client.Address.City
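
The same paths can probably also be turned into calculated fields, so they are applied automatically at search time instead of being piped into every search. This is only a minimal sketch. It assumes the sourcetype from the search above (orderForms) is the props.conf stanza name, that the stanza lives on the search head, and that the eval function spath(_raw, "<path>") resolves these paths the same way the spath command does. The PO line uses the {@attribute} path syntax and is untested against this data.

```ini
# props.conf (search head): hypothetical stanza, named after the sourcetype used above
[orderForms]
# Calculated fields: each EVAL-<field> is evaluated automatically at search time
EVAL-VendorName = spath(_raw, "OrderForm.ClientOrder.Vendor.Name")
EVAL-VendorCity = spath(_raw, "OrderForm.ClientOrder.Vendor.Address.City")
EVAL-ClientName = spath(_raw, "OrderForm.ClientOrder.Client.Name")
EVAL-ClientCity = spath(_raw, "OrderForm.ClientOrder.Client.Address.City")
# Attribute on an element (assumed path syntax; verify against your events)
EVAL-PO = spath(_raw, "OrderForm.ClientOrder{@PO}")
```

With something like this in place, a search such as index=test_orders sourcetype=orderForms | table PO VendorName ClientCity should not need the spath pipes. Elements that occur more than once in an event (several addresses or phone numbers) would likely come back as multivalue fields.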

anhtran
New Member

Ah, very neat, thank you!

miteshvohra
Contributor

If every XML file is a single event, you could try these props.conf settings:

LINE_BREAKER = (?!)
SHOULD_LINEMERGE = false
#BREAK_ONLY_BEFORE = <OrderForm>
DATETIME_CONFIG = NONE
LEARN_MODEL = false
#MAX_EVENTS = 200000
TRUNCATE = 0 

Let us know what worked for you.

Mitesh.

bjoernjensen
Contributor

Hi,

Maybe splitting the XML at the level of "Vendor", "Supplier", and "Client" could help. To do that, use BREAK_ONLY_BEFORE in your props.conf:
http://docs.splunk.com/Documentation/Splunk/6.0.2/Admin/propsconf

with something like this regex:
BREAK_ONLY_BEFORE = ^\s+\<Vendor ID="\d+"\>|^\s+\<Supplier ID="\d+"\>|^\s+\<Client ID="\d+"\>

You may still have to do "something" with the parts between two <OrderForm> elements (e.g. route them to the null queue).
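
One thing to keep in mind: BREAK_ONLY_BEFORE only takes effect when line merging is enabled. Here is a sketch of how the stanza might look, assuming the sourcetype is called orderForms (taken from the search earlier in this thread, not confirmed for your setup). The alternation is the same regex as above, just folded into one group.

```ini
# props.conf: hypothetical stanza name
[orderForms]
# BREAK_ONLY_BEFORE is ignored unless SHOULD_LINEMERGE is true
SHOULD_LINEMERGE = true
# Start a new event at each indented <Vendor>, <Supplier>, or <Client> opening tag
BREAK_ONLY_BEFORE = ^\s+<(Vendor|Supplier|Client) ID="\d+">
```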

bwheelock
Path Finder

I'd lose the corresponding order ID though, in this case the PO#.

anhtran
New Member

Did this solution work for you? Thank you.
