Splunk Search

Dynamic Tag values etraction using mvzip

ipsitam
New Member

Hi,

I am struggling with xml data in splunk and need help in mvzip /mvexpand command to store multi value pairs with different cardinality and expand them as value pair correctly.

I have a xml which looks as below .

  <xml>
  <Order>1-23445565 </Order>
  <SubOrder></SubOrder>
  <Product>Phone</Product>
  <AppointmentID>1234</AppointmentID>
  <SubOrder></SubOrder>
  <Product>Mobile</Product>
  <AppointmentID>12345</AppointmentID>
  <SubOrder></SubOrder>
  <Product>TV</Product>
  <SubOrder></SubOrder>
  <Product>Internet</Product>
  <AppointmentID>123456</AppointmentID>
  <xml>

I am trying to extract all the multivalue pair values ( Product, AppointmentID) and storing them using mvzip command, but when there is a null value it does not handle it well. I also tried with fillnull command which does not seem to work pretty well here .

My search looks like this at this moment to extract the key value pairs.

 index="*" 
  |rex "<Order>(?P<OrderId>[^\<]+)"
  |rex max_match=0 "<Product>(?P<tmpProduct>[^\<]+)"
  |rex max_match=0 "<AppointmentID>(?P<tmpAppointmentID>[^\<]+)"
  |eval field1=mvzip(tmpProduct,tmpAppointmentID)
  | mvexpand field1
  | rex field=field1 "(?<ProductName>[^,]+),(?<AppointmentID>[^,]+)"
  | table OrderId,field1, ProductName,AppointmentID

In output I am getting below results which is incorrect.

  OrderId    field1    ProductName    AppointmentID
  1-23445565    Phone,1234    Phone    1234
  1-23445565    Mobile,12345    Mobile    12345
  1-23445565    TV,123456    TV    123456

I am missing one record with Product called "Internet" and the AppointmentID is wrongly getting appended to Product "TV" as the same product does not have any AppointmentID in the raw xml data.

Ideally My ouput should look like as below.

  OrderId    field1    ProductName    AppointmentID
  1-23445565    Phone,1234    Phone    1234
  1-23445565    Mobile,12345    Mobile    12345
  1-23445565    TV,"N/A"        TV   "N/A"
  1-23445565    Internet,        123456

Any help here is much appreciated.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Your XML is terrible. Ideally, each product-appointment object would be contained in its own element.

That being said, the rex approach cannot work. After extracting those two lists of values, there's no way to tell which value from one list is missing a value from the second - so there's no way to fill in that value later.

What you would need to do is split the _raw text before extracting (untested, but the general approach should work):

  index="*" |rex "<Order>(?P<OrderId>[^\<]+)"
| rex max_match=0 "(?s)(?<object><SubOrder.*?)(?=\s*<(SubOrder|/?xml))"
| mvexpand object
| rex field=object ...

Then extract product and appointment from the split-up object. This way the link between the fields is retained, and you can now tell what missing field belongs to what object.

Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

May 2026 Splunk Expert Sessions: Security & Observability

Level Up Your Operations: May 2026 Splunk Expert Sessions Whether you are refining your security posture or ...

Network to App: Observability Unlocked [May & June Series]

In today’s digital landscape, your environment is no longer confined to the data center. It spans complex ...

SPL2 Deep Dives, AppDynamics Integrations, SAML Made Simple and Much More on Splunk ...

Splunk Lantern is Splunk’s customer success center that provides practical guidance from Splunk experts on key ...