Using the xpath command with namespace declarations (xmlns='mynamespace')

yeahnah
Motivator

Splunk's xpath documentation does not show any examples of how to use the xpath command when the XML contains namespace declarations, e.g. <event xmlns='mynamespace'> or <prefix:Event xmlns:prefix='mynamespace'>. The xpath command will not extract any results unless the event is first modified to remove the namespace declaration(s). Probably the most common workaround is to use the spath command instead.

However, after some googling about XPath path syntax, it turns out there is a local-name() function that can be used in a predicate, so that elements are matched by their local name regardless of any namespace declarations.
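
For comparison, here is a minimal sketch of the strip-the-declaration workaround mentioned above. It assumes the event only uses default (unprefixed) namespace declarations written with single quotes; the field name provider_name and the regular expression are illustrative and not taken from Splunk's documentation.

````spl
| makeresults
| eval _raw="<Event xmlns='nameSpace'>
  <System>
    <Provider Name='B'/>
  </System>
</Event>"
  ``` assumption: only default xmlns='...' declarations in single quotes are present ```
| eval _raw=replace(_raw, " xmlns='[^']*'", "")
  ``` with the declaration removed, a plain path without local-name() can be used ```
| xpath outfield=provider_name "//Provider/@Name"
````

Prefixed declarations such as xmlns:e='...' are deliberately not handled here: removing them would leave prefixes like e:Event undeclared, so the local-name() approach is likely the better fit for that kind of XML.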


yeahnah
Motivator

The following run-anywhere search demonstrates how to use the local-name() notation with the xpath command to extract field values. (Note: the xmlkv command works well, but not on node attribute values, e.g. the Name=<value> attribute used in the example below.)

| makeresults
| eval _raw="<Event>
  <System>
    <Provider Name='A'/>
  </System>
</Event>
<Event xmlns='nameSpace'>
  <System xmlns='anotherNameSpace'>
    <Provider Name='B'/>
  </System>
</Event>
<Event xmlns='nameSpace'>
  <System a='attribute'>
    <Provider Name='C'/>
  </System>
</Event>
<e:Event xmlns:e='prefixed/nameSpace'>
  <s:System xmlns:s='moreNameSpace'>
    <Provider Name='D'>X</Provider>
    <Provider Name='E'>Z</Provider>
  </s:System>
</e:Event>"
  ``` examples of using xpath with XML that contains namespace declarations ```
| xpath outfield=name_no_ns "//Provider/@Name"
| xpath outfield=name_with_ns1 "//*[local-name()='Provider']/@Name"
| xpath outfield=name_with_ns2 "./*/*[local-name()='System'][@a='attribute']/*[local-name()='Provider']/@Name"
| xpath outfield=name_with_ns3 "/*[name()='e:Event' and namespace-uri()='prefixed/nameSpace']/*[name()='s:System']/*[name()='Provider']/@Name"
| xpath outfield=value_with_ns1 "/*[name()='e:Event' and namespace-uri()='prefixed/nameSpace']/*[name()='s:System']/*[name()='Provider']"
| xpath outfield=value_with_ns2 "/*[name()='e:Event' and namespace-uri()='prefixed/nameSpace']/*[name()='s:System']/*[name()='Provider'][@Name='E']"
  ``` spath also provides another method to extract XML values ```
| spath output=spath_attribute path=Event.System{2}.Provider{@Name}   
| spath output=spath_value path=e:Event.s:System.Provider


I raised this with the Splunk documentation team and hopefully they'll add an extended example like the one above to demonstrate namespace support when using xpath.

One last thing: there is currently a bug in the xpath command. If the XML has prolog declarations (e.g. <?xml version=1.0?> or <!DOCTYPE ....>), xpath does not work. I've raised a support case about this. A workaround is to modify the event to remove the prolog declarations, or to use the spath command instead.
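
The prolog workaround could look something like the sketch below. It assumes the prolog is a single <?xml ... ?> declaration at the start of the event; the regular expression and the field name provider_name are illustrative only.

````spl
| makeresults
| eval _raw="<?xml version='1.0'?>
<Event>
  <System>
    <Provider Name='A'/>
  </System>
</Event>"
  ``` assumption: one <?xml ... ?> declaration at the very start of the event ```
| eval _raw=replace(_raw, "^\s*<[?]xml[^>]*[?]>\s*", "")
  ``` run xpath against the event now that the prolog line is gone ```
| xpath outfield=provider_name "//Provider/@Name"
````

A <!DOCTYPE ....> declaration would need its own replace() along the same lines, and a DOCTYPE with an internal subset would probably need a more careful pattern than a simple [^>]* match.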

Hope this helps anyone else who experiences this issue.


yeahnah
Motivator

I created an answer with a workaround for the xpath and prolog header line issue here:
https://community.splunk.com/t5/Splunk-Search/The-xpath-command-does-not-work-with-XML-prolog-header...
