Splunk's xpath documentation does not show any examples of how to use the xpath command when the XML contains namespace declarations, e.g. <event xmlns='mynamespace'> or <prefix:Event xmlns:prefix='mynamespace'>. The xpath command will not extract any results unless the event is modified first and the namespace declaration(s) removed. Probably the most common workaround is to use the spath command instead.
However, after some googling about XPath path syntax, I found there is a local-name() function that can be used in a predicate so that element namespaces are ignored when matching.
The following run-anywhere search demonstrates how to use the local-name() notation with the xpath command to extract field values. (Note: the xmlkv command works well, but not on node attribute values, e.g. the Name=<value> attribute used in the example below.)
| makeresults
| eval _raw="<Event>
<System>
<Provider Name='A'/>
</System>
</Event>
<Event xmlns='nameSpace'>
<System xmlns='anotherNameSpace'>
<Provider Name='B'/>
</System>
</Event>
<Event xmlns='nameSpace'>
<System a='attribute'>
<Provider Name='C'/>
</System>
</Event>
<e:Event xmlns:e='prefixed/nameSpace'>
<s:System xmlns:s='moreNameSpace'>
<Provider Name='D'>X</Provider>
<Provider Name='E'>Z</Provider>
</s:System>
</e:Event>"
``` examples of using xpath with XML that contains namespace declarations ```
| xpath outfield=name_no_ns "//Provider/@Name"
| xpath outfield=name_with_ns1 "//*[local-name()='Provider']/@Name"
| xpath outfield=name_with_ns2 "./*/*[local-name()='System'][@a='attribute']/*[local-name()='Provider']/@Name"
| xpath outfield=name_with_ns3 "/*[name()='e:Event' and namespace-uri()='prefixed/nameSpace']/*[name()='s:System']/*[name()='Provider']/@Name"
| xpath outfield=value_with_ns1 "/*[name()='e:Event' and namespace-uri()='prefixed/nameSpace']/*[name()='s:System']/*[name()='Provider']"
| xpath outfield=value_with_ns2 "/*[name()='e:Event' and namespace-uri()='prefixed/nameSpace']/*[name()='s:System']/*[name()='Provider'][@Name='E']"
``` spath also provides another method to extract XML values ```
| spath output=spath_attribute path=Event.System{2}.Provider{@Name}
| spath output=spath_value path=e:Event.s:System.Provider
I raised this with the Splunk documentation team, and hopefully they'll add an extended example like the one above to demonstrate namespace support when using xpath.
One last thing: there is currently a bug in the xpath command. If the XML has prolog declarations (e.g. <?xml version=1.0?> or <!DOCTYPE ....>), xpath does not work. I've raised a support case about this. A workaround is to modify the event and remove the prolog declarations, or to use the spath command instead.
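A minimal sketch of the "remove the prolog" workaround follows. It assumes the prolog is a simple <?xml ...?> declaration and/or a <!DOCTYPE ...> without an internal subset. The two regular expressions and the provider_name field name are illustrative assumptions, not something taken from Splunk's documentation, and the search has not been checked against every Splunk version.

```spl
| makeresults
| eval _raw="<?xml version='1.0'?>
<Event xmlns='nameSpace'>
<System>
<Provider Name='A'/>
</System>
</Event>"
| eval _raw=replace(_raw, "<[?]xml[^>]*[?]>", "")
| eval _raw=replace(_raw, "<!DOCTYPE[^>]*>", "")
| xpath outfield=provider_name "//*[local-name()='Provider']/@Name"
```

The two eval lines strip the prolog from _raw at search time only; the indexed event is not changed. To leave _raw untouched, the cleaned XML could instead be written to a separate field and passed to xpath through its field argument.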
Hope this helps anyone else who experiences this issue.
I created an answer with a workaround for the xpath and prolog header line issue here:
https://community.splunk.com/t5/Splunk-Search/The-xpath-command-does-not-work-with-XML-prolog-header...