XML Parsing with xpath

Communicator

Hello,

I am trying to use xpath to retrieve certain fields from my XML file. The file looks something like this:

<machines>
    <machine name="a" port="1" active="true">
        <value>
            <replace>Example</replace>
            <cfg>
                <serverUrl>URL1</serverUrl>
                <serverUsername>Username1</serverUsername>
            </cfg>
        </value>
    </machine>
    <machine name="b" port="2" active="true">
        <value>
            <replace>Example</replace>
            <cfg>
                <serverUrl>URL2</serverUrl>
                <serverUsername>Username2</serverUsername>
            </cfg>
        </value>
    </machine>
</machines>

I want to pick out certain fields and link them together. For example, one report should get the machine port and the serverUrl and show them in a table, like this:

Node   URL
1      URL1   
2      URL2

The following commands get me the info I need, but if I put them together, the first one breaks.

The command is as follows:

index=DTS sourcetype=config |  xpath "//machine/@port" outfield=port | xpath "//machine/value/with/cfg/serverUrl" outfield=name

Separately, these work correctly and I see multiple values coming up, e.g. port=1, port=2 or name=a, name=b.

If I pipe them together, I get port=12, name=a, name=b; that is, the first result is joined into a single string.

Any ideas how I can get the table I am looking for?

Thanks, Hazel

Splunk Employee

As gkanapathy said, this is a bug with the specification of the xpath command and its support for multivalued fields. You can fix this in your installation by adding to $SPLUNK_HOME/etc/apps/search/local/commands.conf:

[xpath]
supports_multivalues = true

Splunk Employee

This basically looks like a bug in the Splunk Intersplunk libraries, where incoming multivalued fields seem to get their multiple values discarded. Please file a bug with Splunk Support.

Thanks!

We'll see if we can maybe come up with a workaround...it looks like the original MV values are in there, separated by newline characters.


Here is a workaround. Add this to your search:

... | eval port=split(replace(port,"\n",";"),";")

This will only work right if the ; character does not appear anywhere in the field values. If it does, pick another character that doesn't appear as your delimiter for this workaround.
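
For anyone who wants to see what that eval is doing, here is a minimal Python sketch of the same replace-then-split logic, run outside Splunk. The function name `split_mv` and the explicit delimiter check are my own illustrative assumptions, not anything Splunk provides.

```python
def split_mv(value, delimiter=";"):
    """Rebuild a list of values from a newline-joined string.

    Mirrors eval port=split(replace(port, "\n", ";"), ";").
    Hypothetical helper for illustration only.
    """
    # The workaround is only safe when the delimiter is absent from the data,
    # so refuse to guess if it already appears in the raw value.
    if delimiter in value:
        raise ValueError("delimiter appears in the data; pick another one")
    # Step 1: turn each newline into the delimiter.
    joined = value.replace("\n", delimiter)
    # Step 2: split on the delimiter to recover the individual values.
    return joined.split(delimiter)


ports = split_mv("1\n2")
```

With the two-machine example from the question, `ports` holds the two port values as separate items again instead of one merged string.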

Splunk Employee

and dnolan. You can use a combination of both: use rex (or xpath) to first split the file into multiple parts, then mvexpand to break it into separate events, then use rex or xpath on each event.
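
To make the split-then-extract idea concrete, here is a rough Python sketch of the same three steps, run outside Splunk: cut the raw event into one chunk per machine, then pull the port and serverUrl out of each chunk so they stay paired. The regular expressions and the name `rows_from_event` are assumptions based on the sample in the question, not a tested Splunk search.

```python
import re

# One chunk per <machine ...>...</machine> block (the "split, then mvexpand" step).
# The \s after "machine" keeps the outer <machines> tag from matching.
MACHINE_RE = re.compile(r"<machine\s.*?</machine>", re.DOTALL)
# Per-chunk extractions (the "rex on each event" step).
PORT_RE = re.compile(r'port="?(\d+)')
URL_RE = re.compile(r"<serverUrl>([^<]+)</serverUrl>")


def rows_from_event(raw):
    """Return (port, serverUrl) pairs, one per machine block in the raw event."""
    rows = []
    for chunk in MACHINE_RE.findall(raw):
        port = PORT_RE.search(chunk)
        url = URL_RE.search(chunk)
        # Both values come from the same chunk, so they cannot get mismatched.
        if port and url:
            rows.append((port.group(1), url.group(1)))
    return rows
```

The point of the design is that each value is looked up inside a single machine block, so the pairing never depends on two multivalued fields happening to line up.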

Splunk Employee

That is another problem, and that is more what Kevin and Lowell are addressing.

Communicator

Hello again. I tried this, and it works to split up the ports again. Do you know how I can link my two xpaths, so I know which port number goes with which server URL?

Communicator

Thank you. I thought perhaps I might not be using the correct approach. We are just looking to pick out certain values from the XML config and generate a report of those values. For example, in this case we want to know which serverUrls go with which ports. I have raised a support case.

Explorer

I believe the problem is you're pulling out two multivalued fields, and not providing a way to associate between them. I think you may need to expand the one event into multiple events before doing the second query to get what you want.

I think you want to do something like the search below, but I don't think it actually works: the eval() inside the xpath command doesn't seem to be executed, and I can't see any other way to pass a field in as part of the xpath query. I'm new to Splunk, though, so maybe I'm missing something. Here's what I tried:

... | xpath "//machine/@port" outfield=port | mvexpand port | xpath eval("//machine[@port=".port."]/value/cfg/serverUrl") outfield=serverUrl

Maybe rex is the only way to go.
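
For what it's worth, the per-port lookup I was attempting is easy to express outside Splunk. This is only a Python sketch of the idea using the standard library's `xml.etree.ElementTree`, assuming well-formed XML with quoted attributes; the sample data and the name `port_url_rows` are made up for illustration, and this is not something the xpath search command does for you.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the sample from the question (assumed).
SAMPLE = """<machines>
    <machine name="a" port="1" active="true">
        <value><cfg><serverUrl>URL1</serverUrl></cfg></value>
    </machine>
    <machine name="b" port="2" active="true">
        <value><cfg><serverUrl>URL2</serverUrl></cfg></value>
    </machine>
</machines>"""


def port_url_rows(xml_text):
    """Pair each machine's port with its serverUrl via a per-port query."""
    root = ET.fromstring(xml_text)
    # First pass: collect every port (the xpath + mvexpand step).
    ports = [machine.get("port") for machine in root.iter("machine")]
    rows = []
    for port in ports:
        # Second pass: build the query from the port value, which is what
        # the eval() in my search was meant to do.
        path = f".//machine[@port='{port}']/value/cfg/serverUrl"
        for node in root.findall(path):
            rows.append((port, node.text))
    return rows


rows = port_url_rows(SAMPLE)
```

Here `rows` ends up with one (port, serverUrl) pair per machine, which is the Node/URL table the original question asks for.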

Path Finder

How are your events defined?

If each line is its own event (no event breaking has been applied) you could just use rex.

If the whole file is treated as an event, you could create "sub" events using eval, breaking each file into multiple events at search time, then finding the values with rex.

You could also look into breaking up the events at index time (if applicable to your uses) by setting the following in props.conf:

SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = <machine name

and then just use rex as mentioned above.

ps - I like rex 🙂

Communicator

Hello, thanks. Breaking on "<machine name" works very well in props, but I would now need to do a sort of sub-break: take the machine sections and break them up again (they have multiple value sections). Any idea how to do this?

Super Champion

Kevin is suggesting that you make two events (based on your example), so you would have one event for each <machine> tag, not one event per line. But I'm not exactly sure how you would do this with an eval or a rex command. IMHO, such a search command would be a nice addition to Splunk.

Communicator

Hello, thanks for the suggestion. The config file comes through as one event, as this is how we need it for another use case, so I would prefer to do it at search time. How are you suggesting I use rex? If each line is its own event, how can I link up the line that gives the port number with the line that gives the EMS server URL?

Builder

It looks like one difference here is that port is an attribute versus the element serverUrl. Out of curiosity, if you reverse it, does the same thing happen? That is, if you do: index=DTS sourcetype=config | xpath "//machine/value/with/cfg/serverUrl" outfield=name | xpath "//machine/@port" outfield=port

Do you get serverUrl=ab, port=1, port=2 ?

Communicator

Hello, yes, if I switch them I get the same error but the other way round (serverUrl=ab, port=1, port=2).
