Search with XPath

gallantalex
Path Finder

Hi, I am having a problem searching an XML-formatted event. I have an event that looks like this:

<?xml version="1.0" ?>
<Products>
  <Product name="CodeAnalyzer" version="2"/>
  <Product name="ScmKitCommon" version="2">
    <Component name="ScmNantTasks" version="2"/>
  </Product>
  <Product name="ScmKitInternal" version="1">
    <Component name="ScmToolsProjectConfiguration" version="1"/>
    <Component name="StateObjects" version="1"/>
    <Component name="XsdMaint" version="1"/>
  </Product>
  <Product name="ScmKitProduction" version="2.0.0.9"/>
</Products>

This event was indexed from a script and not an actual xml file, so I don't know if that makes a difference. What I would like is to list all the product names for this event. I have something like this:

... | dedup 1 host | xpath "//Products/Product/@name" outfield=name | table name

But all it lists is CodeAnalyzer. I also changed the commands.conf file in the search app to the following, but nothing changed:

 [xpath]
 supports_multivalues = true

Actually when I tried an xpath that is completely wrong, I still got the same result. What am I missing, how is xpath supposed to be used?

Thanks.

gallantalex
Path Finder

Thanks for the suggestion. I got the results I was looking for using this:

... | rex max_match=100 "<Product name=(?<name>.*?) version" | table name host
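One detail of this rex is easy to miss: the lazy group sits between `name=` and ` version`, so the captured values should include the surrounding quote characters. Here is a minimal sketch of that behaviour using Python's `re` module as a stand-in for Splunk's regex engine (an assumption, though both follow PCRE-style syntax for this pattern). The `without_quotes` variant is a hypothetical alternative, not something from the thread.

```python
import re

# Sample event body copied from the question (XML declaration omitted).
EVENT = """<Products>
  <Product name="CodeAnalyzer" version="2"/>
  <Product name="ScmKitCommon" version="2">
    <Component name="ScmNantTasks" version="2"/>
  </Product>
  <Product name="ScmKitInternal" version="1">
    <Component name="ScmToolsProjectConfiguration" version="1"/>
    <Component name="StateObjects" version="1"/>
    <Component name="XsdMaint" version="1"/>
  </Product>
  <Product name="ScmKitProduction" version="2.0.0.9"/>
</Products>"""

# Same pattern as the rex above: everything between 'name=' and
# ' version' is captured, including the double quotes.
with_quotes = re.findall(r"<Product name=(.*?) version", EVENT)

# Hypothetical variant: match the quotes outside the capture group
# so that only the bare product name is kept.
without_quotes = re.findall(r'<Product name="(.*?)" version', EVENT)

print(with_quotes)
print(without_quotes)
```

If the quotes are unwanted in the table, moving them outside the group as in the second pattern is one way to drop them.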

I tried to use xpath again and figured out the problem. First of all, the XML declaration (shown below) was causing the xpath expression to fail. Once I removed it from my script's output, my xpath expressions worked at times.

<?xml version="1.0" ?>

Secondly, only xpath expressions that began with '//' worked. So something like:

... | xpath "//Products/Product/@name"

would store the right attribute in the xpath field. But

... | xpath "/Products/Product/@name"

would not work even though Products is the root element. Also, when I use the default value, it seems to overwrite the field even though the field exists. Well, I'm just glad it finally works.

gallantalex
Path Finder

My xpath was correct and the data was structured correctly as well. I used outside programs to double check my xpaths and xml data. That was never the problem.
But your responses made me try xpath again and I figured out what the problem was. I will edit my response with what I found out.

Genti
Splunk Employee

Again, did you try running xpath on the command line (not within Splunk)?
If you want, paste the entire XML and I can run the test on it. If it works on the CLI, it should work in Splunk too.

Lowell
Super Champion

If your XML is always broken in the same way, it would be possible to fix it with rex.

I haven't tested this, but this should get you started:

... | rex mode=sed "s/(<Product [^>]+\/>)\s*<\/Product>/\1/g" | xpath "//Products/Product/@name" outfield=name | ...
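The substitution idea can be tried outside Splunk before it goes into a search. Below is a rough sketch with Python's `re.sub`, assuming the breakage is always a stray `</Product>` right after a self-closing `<Product .../>` tag. The `BROKEN` sample is a shortened, hypothetical version of the malformed event, and `xml.etree.ElementTree` is used only to confirm the result parses.

```python
import re
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample with a stray </Product> before </Products>.
BROKEN = """<Products>
  <Product name="ScmKitInternal" version="1">
    <Component name="XsdMaint" version="1"/>
  </Product>
  <Product name="ScmKitProduction" version="2.0.0.9"/>
  </Product>
</Products>"""

# Keep the self-closing <Product .../> tag (group 1) and drop the
# closing tag that directly follows it.
fixed = re.sub(r"(<Product [^>]+/>)\s*</Product>", r"\1", BROKEN)

# If the repair worked, the text now parses and both products are listed.
repaired_names = [p.get("name") for p in ET.fromstring(fixed).findall("Product")]
print(repaired_names)
```

The legitimate `</Product>` after the `Component` element is left alone because the pattern only fires directly after a self-closing `Product` tag.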


However, I suppose the question should be asked: do you ever have any other "name" fields within your data? If not, a much simpler approach would be this:

 ... | rex max_match=100 "<Product name=([\"'])(?<name>\w+)\1" | table name

You may need to use something more complicated than "\w+", but that's the idea.
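To see what the quote-matching idea does, here is a small sketch in Python. It assumes Python's `re` module as a stand-in for Splunk's regex engine; note that Python spells named groups `(?P<name>...)` rather than `(?<name>...)`. The sample text is hypothetical and mixes single and double quotes on purpose.

```python
import re

# Hypothetical sample mixing double- and single-quoted attributes.
EVENT = """<Products>
  <Product name="CodeAnalyzer" version="2"/>
  <Product name="ScmKitCommon" version="2">
    <Component name="ScmNantTasks" version="2"/>
  </Product>
  <Product name='ScmKitInternal' version="1"/>
</Products>"""

# Group 1 remembers which quote character opened the value, and the
# backreference \1 requires the same character to close it.
pattern = re.compile(r"""<Product name=(["'])(?P<name>\w+)\1""")

# finditer plays the role of max_match: it returns every match,
# not just the first one.
names = [m.group("name") for m in pattern.finditer(EVENT)]
print(names)
```

Because the pattern is anchored on `<Product name=`, the `Component` names are skipped, which is the point of asking whether other "name" fields exist in the data.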

Genti
Splunk Employee

I used your example (saved it as new.xml) to run a quick check:

xpath new.xml "//Products/Product/@name"

here is the result:

mismatched tag at line 13, column 4, byte 462:
  </Product>
  <Product name="ScmKitProduction" version="2.0.0.9"/>
  </Product>
===^
</Products>
 at /System/Library/Perl/Extras/5.10.0/darwin-thread-multi-2level/XML/Parser.pm line 187

Your XML is malformed, so xpath cannot parse it. That is why you get nothing useful from Splunk, the same as if you were running an xpath that is completely wrong.

Remove the last </Product> right before the </Products> line and test your xpath once again.
In other words, your XML should be:

<?xml version="1.0" ?>
<Products>
  <Product name="CodeAnalyzer" version="2"/>
  <Product name="ScmKitCommon" version="2">
    <Component name="ScmNantTasks" version="2"/>
  </Product>
  <Product name="ScmKitInternal" version="1">
    <Component name="ScmToolsProjectConfiguration" version="1"/>
    <Component name="StateObjects" version="1"/>
    <Component name="XsdMaint" version="1"/>
  </Product>
  <Product name="ScmKitProduction" version="2.0.0.9"/>
</Products>

Hope this helps 😉
.gz

ps: here is the result for the above xpath in the cli:

gzaimi@bigmac ~/Testing/logs xpath new.xml "//Products/Product/@name"
Found 4 nodes:
-- NODE --
 name="CodeAnalyzer"-- NODE --
 name="ScmKitCommon"-- NODE --
 name="ScmKitInternal"-- NODE --
 name="ScmKitProduction"
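If the `xpath` command-line tool is not available, the same sanity check can be sketched with Python's standard library. This assumes `xml.etree.ElementTree`, which supports only a limited XPath subset, so the names are read with `findall("Product")` plus `get("name")` instead of `//Products/Product/@name`. `BAD_XML` re-creates the stray closing tag from the originally posted event.

```python
import xml.etree.ElementTree as ET

# Corrected event, including the XML declaration.
GOOD_XML = """<?xml version="1.0" ?>
<Products>
  <Product name="CodeAnalyzer" version="2"/>
  <Product name="ScmKitCommon" version="2">
    <Component name="ScmNantTasks" version="2"/>
  </Product>
  <Product name="ScmKitInternal" version="1">
    <Component name="ScmToolsProjectConfiguration" version="1"/>
    <Component name="StateObjects" version="1"/>
    <Component name="XsdMaint" version="1"/>
  </Product>
  <Product name="ScmKitProduction" version="2.0.0.9"/>
</Products>"""

# Re-create the malformed variant by adding a stray closing tag.
BAD_XML = GOOD_XML.replace("</Products>", "</Product>\n</Products>")


def product_names(xml_text):
    """Return the name attribute of each Product directly under the root."""
    root = ET.fromstring(xml_text)
    return [p.get("name") for p in root.findall("Product")]


names = product_names(GOOD_XML)

# The malformed text should fail to parse rather than return partial results.
try:
    product_names(BAD_XML)
    bad_error = None
except ET.ParseError as exc:
    bad_error = str(exc)

print(names)
print(bad_error)
```

A parse error here points at the data rather than the XPath expression, which is the distinction the CLI test above is making.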

gallantalex
Path Finder

Good find, but that was just my mistake when I posted the question. I removed a bunch of other Product nodes to shorten the example and was a little careless apparently.
