Splunk Search

parsing XML data with hierarchy of KV pairs

halr9000
Motivator

Banging my head on this one for too long, could use some help.

Take a sample doc such as the below, where you have a hierarchy of KV pairs:

<root>
  <ItemSets>
    <ItemCollection>
      <Name>name1</Name>
      <Value>value1</Value>
    </ItemCollection>
    <ItemCollection>
      <Name>name2</Name>
      <Value>value2</Value>
    </ItemCollection>
    <ItemCollection>
      <Name>name3</Name>
      <Value>value3</Value>
    </ItemCollection>
  </ItemSets>
</root>

We want to do stuff based on the values of keys. But due to the field naming, a “naïve” approach using spath’d dot notation won’t work, because you have n “root.ItemSets.ItemCollection.Name” values in the same event, with nothing tying each Name to its Value.

... | spath | search root.ItemSets.ItemCollection.Name="name1"

And I cannot rely on the ordering, so using an array index won't help. I started down the route of xpath:

... | xpath "//ItemCollection/Name" outfield=xml | search xml="name1"

But that hasn't gotten me any closer, because the goal isn't to find the one element, it's to associate that element's value (say "name1") with its sibling ("value1") in a search, and then return either the value or the whole event.

In pseudocode, I want this:

for each ItemCollection {
  if Name = "name1" {
    print Value # or print Event
  }
}

Been playing around with makemv, but don't have anything worth showing off and I'm not sold on that technique anyway.

1 Solution

javiergn
SplunkTrust

Would this maybe work for you?

| stats count | fields - count
| eval _raw = "
 <root>
   <ItemSets>
     <ItemCollection>
       <Name>name1</Name>
       <Value>value1</Value>
     </ItemCollection>
     <ItemCollection>
       <Name>name2</Name>
       <Value>value2</Value>
     </ItemCollection>
     <ItemCollection>
       <Name>name3</Name>
       <Value>value3</Value>
     </ItemCollection>
 </ItemSets>
 </root>
"
| spath path=root.ItemSets.ItemCollection output=ItemCollections
| mvexpand ItemCollections
| spath input=ItemCollections
| search Name=name1
| table Value

Output:

Value
-----------
value1 
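
If you'd rather keep one row per event instead of expanding it, a possible variant is to pull Name and Value into two parallel multivalue fields and look up the position with mvfind and mvindex. This is only a sketch: the field names Names and Values are made up for illustration, and it assumes every ItemCollection has exactly one Name and one Value, in the same order, so the two lists line up. If either element can be missing, the mvexpand approach is the safer one.

```spl
| makeresults
| eval _raw="<root><ItemSets><ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection><ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection></ItemSets></root>"
``` Names and Values are illustrative field names, not part of the original post ```
| spath path=root.ItemSets.ItemCollection.Name output=Names
| spath path=root.ItemSets.ItemCollection.Value output=Values
``` mvfind returns the 0-based index of the first value matching the regex; mvindex reads that position ```
| eval Value=mvindex(Values, mvfind(Names, "^name1$"))
``` drop events that have no ItemCollection named name1 ```
| where isnotnull(Value)
| table _raw Value
```

Keeping _raw in the table covers the "return the whole event" case from the question; swap it for `| table Value` if only the value is needed.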

sloshburch
Splunk Employee

that is so friggin cool. thanks for showing that!


halr9000
Motivator

Thanks! Accepting this one because:

  • XPath makes my head hurt
  • This solution feels really Splunky and I really needed a good mvexpand example.

woodcock
Esteemed Legend

Like this:

... | rex max_match=0 field=_raw "(?msi)<Name>(?<KEY>.*?)<\/Name>\s+<Value>(?<VAL>.*?)<\/Value>"
| eval _raw=mvzip(KEY,VAL, "=") | fields - KEY VAL
| extract limit=0 mv_add=t kvdelim="="

This gives you:

name1     name2     name3
value1    value2    value3

aljohnson_splun
Splunk Employee

@halr9000 - is the doc above a single event?


halr9000
Motivator

Yup, one event.


ekim_splunk
Splunk Employee

What you can do is add a predicate (the square-bracket notation in XPath) at the relevant step of the XPath expression, which gives you that "where" type of filtering.

If you try the following example where you're looking for Name = 'name2' ...

| makeresults
| eval _raw="<root><ItemSets><ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection><ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection><ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection></ItemSets></root>"
| xpath "//root/ItemSets/ItemCollection[Name/text()='name2']/Value" outfield=value

You should get the result "value2" in the value field.
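
A note on the "/text()" part: in XPath 1.0, comparing an element to a string compares the element's string value, so for a leaf element like Name the predicate [Name='name2'] should select the same nodes as [Name/text()='name2']. Here is a sketch of the shorter form. It assumes the xpath command accepts standard XPath 1.0 predicates, which has not been checked here against a specific Splunk version. The trailing search is an addition that keeps only events where a match was found, so the whole event can be returned.

```spl
| makeresults
| eval _raw="<root><ItemSets><ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection><ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection><ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection></ItemSets></root>"
``` predicate without /text(): compares the string value of the Name child ```
| xpath "//ItemCollection[Name='name2']/Value" outfield=value
``` keep only events where the predicate matched something ```
| search value=*
| table _raw value
```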

halr9000
Motivator

After playing for another 30 min with the actual data, I could not get this technique to work. I've decided that it's me, and that I don't like XPath. 🙂


ekim_splunk
Splunk Employee

If this were 10 years ago, when XML was all the rage, I'd wax eloquent about why you should love XPath. 😉 If you ever did want to get down and dirty with XPath (which it sounds like you probably won't), I'd always be happy to help.


halr9000
Motivator

thx! I knew your solution was possible, but syntax was killing me. That "/text()" part in particular was not obvious even after reading specs and examples.
