Splunk Search

parsing XML data with hierarchy of KV pairs

Motivator

Banging my head on this one for too long, could use some help.

Take a sample doc such as the below, where you have a hierarchy of KV pairs:

<root>
  <ItemSets>
    <ItemCollection>
      <Name>name1</Name>
      <Value>value1</Value>
    </ItemCollection>
    <ItemCollection>
      <Name>name2</Name>
      <Value>value2</Value>
    </ItemCollection>
    <ItemCollection>
      <Name>name3</Name>
      <Value>value3</Value>
    </ItemCollection>
  </ItemSets>
</root>

We want to do stuff based on the values of keys. But due to the field naming, a “naïve” approach using spath’d dot notation won’t work, because you have n “root.ItemSets.ItemCollection.Name” fields in the same event.

... | spath | search root.ItemSets.ItemCollection.Name="name1"

And I cannot rely on the ordering, so using an array index won't help. I started down the route of xpath:

... | xpath "//ItemCollection/Name" outfield=xml | search xml="name1"

But that hasn't gotten me any closer, because the goal isn't to find the one element, it's to associate that element's value (say "name1") with its sibling ("value1") in a search, and then return either the value or the whole event.

In pseudocode, I want this:

for each ItemCollection {
  if Name = "name1" {
    print Value # or print Event
  }
}

Been playing around with makemv, but don't have anything worth showing off and I'm not sold on that technique anyway.

1 Solution

SplunkTrust

Would this maybe work for you?

| stats count | fields - count
| eval _raw = "
 <root>
   <ItemSets>
     <ItemCollection>
       <Name>name1</Name>
       <Value>value1</Value>
     </ItemCollection>
     <ItemCollection>
       <Name>name2</Name>
       <Value>value2</Value>
     </ItemCollection>
     <ItemCollection>
       <Name>name3</Name>
       <Value>value3</Value>
     </ItemCollection>
   </ItemSets>
 </root>
"
| spath path=root.ItemSets.ItemCollection output=ItemCollections
| mvexpand ItemCollections
| spath input=ItemCollections
| search Name=name1
| table Value

Output:

Value
-----------
value1 
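
If you want the whole event rather than just the sibling value, the same pipeline should still work, because mvexpand copies every other field (including _raw) onto each expanded row. A minimal sketch, assuming the document arrives as a single event and reusing the ItemCollections field name from the search above:

```spl
... | spath path=root.ItemSets.ItemCollection output=ItemCollections
| mvexpand ItemCollections
| spath input=ItemCollections
| search Name=name1
| table _raw Value
```

Each matching row carries the original XML in _raw alongside the matched Value. One caveat: mvexpand is subject to a memory limit, so on events with very many ItemCollection elements its output can be truncated.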

Ultra Champion

that is so friggin cool. thanks for showing that!

Motivator

Thanks! Accepting this one because:

  • XPath makes my head hurt
  • This solution feels really Splunky and I really needed a good mvexpand example.

Esteemed Legend

Like this:

... | rex max_match=0 field=_raw "(?msi)<Name>(?<KEY>.*?)<\/Name>\s*<Value>(?<VAL>.*?)<\/Value>"
| eval _raw=mvzip(KEY,VAL, "=") | fields - KEY VAL
| extract limit=0 mv_add=t kvdelim="="

This gives you:

name1     name2     name3
value1    value2    value3

Splunk Employee

@halr9000 - is the doc above a single event?

Motivator

Yup one event

Splunk Employee

What you can do is add a predicate (the square-bracket notation in XPath) at the relevant step of the XPath expression, which gives you that "where"-style filtering.

If you try the following example where you're looking for Name = 'name2' ...

| makeresults
| eval _raw="<root><ItemSets><ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection><ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection><ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection></ItemSets></root>"
| xpath "//root/ItemSets/ItemCollection[Name/text()='name2']/Value" outfield=value

You should get the result "value2" in the value field.
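
The square-bracket predicate is what ties Name to its sibling: ItemCollection[...] keeps only the ItemCollection elements for which the bracketed test is true, and the trailing /Value then selects the Value child of those elements only. In standard XPath 1.0 the /text() step is optional here, since comparing an element to a string compares the element's text content. A sketch of that shorter form, assuming Splunk's xpath command evaluates standard XPath 1.0 predicates (not tested against a live instance), followed by a filter that keeps only events where a match was found:

```spl
| makeresults
| eval _raw="<root><ItemSets><ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection><ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection><ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection></ItemSets></root>"
| xpath "//ItemCollection[Name='name2']/Value" outfield=value
| where isnotnull(value)
| table value
```

The where clause relies on the value field staying unset when the XPath matches nothing, which makes this usable as an event filter as well as an extraction.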

Motivator

After playing for another 30 min with the actual data, I could not get this technique to work. I've decided that it's me, and that I don't like XPath. 🙂

Splunk Employee

If this were 10 years ago, when XML was all the rage, I'd wax eloquent about why you should love XPath. 😉 If you ever do want to get down and dirty with XPath (which it sounds like you probably won't), I'd always be happy to help.

Motivator

Thanks! I knew your solution was possible, but the syntax was killing me. That "/text()" part in particular was not obvious even after reading specs and examples.
