Splunk Search

parsing XML data with hierarchy of KV pairs

halr9000
Motivator

Banging my head on this one for too long, could use some help.

Take a sample doc such as the below, where you have a hierarchy of KV pairs:

<root>
  <ItemSets>
    <ItemCollection>
      <Name>name1</Name>
      <Value>value1</Value>
    </ItemCollection>
    <ItemCollection>
      <Name>name2</Name>
      <Value>value2</Value>
    </ItemCollection>
    <ItemCollection>
      <Name>name3</Name>
      <Value>value3</Value>
    </ItemCollection>
  </ItemSets>
</root>

We want to do stuff based on the values of keys. But due to the field naming, a “naïve” approach using spath’d dot notation won’t work, because you end up with n “root.ItemSets.ItemCollection.Name” fields in the same event.

... | spath | search root.ItemSets.ItemCollection.Name="name1"

And I cannot rely on the ordering, so using an array index won't help. I started down the route of xpath:

... | xpath "//ItemCollection/Name" outfield=xml | search xml="name1"

But that hasn't gotten me any closer, because the goal isn't to find the one element; it's to associate that element's value (say "name1") with its sibling ("value1") in a search, and then return either the value or the whole event.

In pseudocode, I want this:

for each ItemCollection {
  if Name = "name1" {
    print Value # or print Event
  }
}
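
For anyone who wants to check the intended logic outside Splunk, the loop above can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration of the pseudocode, not a Splunk search; the SAMPLE string and the values_for helper are names made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, held in a string for illustration.
SAMPLE = """<root>
  <ItemSets>
    <ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection>
    <ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection>
    <ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection>
  </ItemSets>
</root>"""


def values_for(xml_text, wanted_name):
    """Return the Value text of every ItemCollection whose Name matches."""
    root = ET.fromstring(xml_text)
    matches = []
    # iter() visits every ItemCollection, so element order does not matter.
    for item in root.iter("ItemCollection"):
        if item.findtext("Name") == wanted_name:
            matches.append(item.findtext("Value"))
    return matches


print(values_for(SAMPLE, "name1"))  # prints ['value1']
```

The point is that Name and Value are read from the same ItemCollection element, which is exactly the association that flat dot-notation fields lose.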

Been playing around with makemv, but don't have anything worth showing off and I'm not sold on that technique anyway.

1 Solution

javiergn
Super Champion

Would this maybe work for you?

| stats count | fields - count
| eval _raw = "
 <root>
   <ItemSets>
     <ItemCollection>
       <Name>name1</Name>
       <Value>value1</Value>
     </ItemCollection>
     <ItemCollection>
       <Name>name2</Name>
       <Value>value2</Value>
     </ItemCollection>
     <ItemCollection>
       <Name>name3</Name>
       <Value>value3</Value>
     </ItemCollection>
   </ItemSets>
 </root>
"
| spath path=root.ItemSets.ItemCollection output=ItemCollections
| mvexpand ItemCollections
| spath input=ItemCollections
| search Name=name1
| table Value

Output:

Value
-----------
value1 

sloshburch
Splunk Employee

that is so friggin cool. thanks for showing that!

halr9000
Motivator

Thanks! Accepting this one because:

  • XPath makes my head hurt
  • This solution feels really Splunky and I really needed a good mvexpand example.

woodcock
Esteemed Legend

Like this:

... | rex max_match=0 field=_raw "(?msi)<Name>(?<KEY>.*?)<\/Name>\s+<Value>(?<VAL>.*?)<\/Value>"
| eval _raw=mvzip(KEY,VAL, "=") | fields - KEY VAL
| extract limit=0 mv_add=t kvdelim="="

This gives you:

name1     name2     name3
value1    value2    value3
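
The pairing idea behind the rex/mvzip approach (capture each Name together with the Value that follows it, then turn the pairs into fields) can be sketched outside Splunk in Python. This is an illustration only, not what the SPL commands do internally; EVENT, PAIR, and pairs_to_dict are hypothetical names, and the sketch uses \s* rather than \s+ between the tags so that it also matches XML with no whitespace between elements.

```python
import re

# One event containing the sample XML from the question (compact form).
EVENT = (
    "<root><ItemSets>"
    "<ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection>"
    "<ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection>"
    "<ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection>"
    "</ItemSets></root>"
)

# Capture each Name and the Value that immediately follows it.
# DOTALL lets '.' span newlines; IGNORECASE mirrors the (?i) flag in the rex.
PAIR = re.compile(
    r"<Name>(.*?)</Name>\s*<Value>(.*?)</Value>",
    re.DOTALL | re.IGNORECASE,
)


def pairs_to_dict(raw):
    """Map each Name to its sibling Value (a repeated Name keeps the last Value)."""
    return dict(PAIR.findall(raw))


fields = pairs_to_dict(EVENT)
print(fields["name1"])  # prints value1
```

Like the rex above, this relies on Value directly following Name inside each ItemCollection; if the real data can reorder or interleave those elements, a parser-based approach is safer.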

aljohnson_splun
Splunk Employee

@halr9000 - is the doc above equal to one event?

halr9000
Motivator

Yup one event

ekim_splunk
Splunk Employee

What you can do is add a predicate (the square-bracket notation in XPath) at the relevant step of the XPath expression, which gives you that "where" type of filtering.

If you try the following example where you're looking for Name = 'name2' ...

| makeresults
| eval _raw="<root><ItemSets><ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection><ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection><ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection></ItemSets></root>"
| xpath "//root/ItemSets/ItemCollection[Name/text()='name2']/Value" outfield=value

You should get the result "value2" in the value field.
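
The same predicate idea can be tried outside Splunk. Below is a minimal Python sketch using the standard library's xml.etree.ElementTree. One assumption to flag: ElementTree implements only a subset of XPath and has no text() step, so the predicate is written [Name='name2'] instead of [Name/text()='name2']. EVENT is a made-up variable name.

```python
import xml.etree.ElementTree as ET

# Same single-line sample document as in the search above.
EVENT = (
    "<root><ItemSets>"
    "<ItemCollection><Name>name1</Name><Value>value1</Value></ItemCollection>"
    "<ItemCollection><Name>name2</Name><Value>value2</Value></ItemCollection>"
    "<ItemCollection><Name>name3</Name><Value>value3</Value></ItemCollection>"
    "</ItemSets></root>"
)

root = ET.fromstring(EVENT)

# The bracketed predicate keeps only ItemCollection elements whose Name
# child has the text 'name2'; the trailing /Value then selects the sibling.
hits = root.findall("./ItemSets/ItemCollection[Name='name2']/Value")
values = [element.text for element in hits]

print(values)  # prints ['value2']
```

The predicate filters at the ItemCollection step, before Value is selected, which is why the position of the square brackets in the path matters.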

halr9000
Motivator

After playing for another 30 min with the actual data, I could not get this technique to work. I've decided that it's me, and that I don't like XPath. 🙂

ekim_splunk
Splunk Employee

If this were 10 years ago, when XML was all the rage, I'd wax eloquent about why you should love XPath. 😉 If you ever do want to get deep down and dirty with XPath (which it sounds like you probably won't), I'd always be happy to help.

halr9000
Motivator

thx! I knew your solution was possible, but syntax was killing me. That "/text()" part in particular was not obvious even after reading specs and examples.
