XML File with namespaces parsing


Hi All,

I have a log whose events are XML with namespaces/XSL. Example log:

<soap:Envelope xsi:schemaLocation=" 
xmlns:ds="" xmlns:xsi="">
<audit:BroadcastAudit version="1.1">
<xcs:Message><****this is also some xml******></xcs:Message>

I am able to index it with proper timestamp recognition.
What I want to do is extract fields automatically from tags such as DeskSys, MessageTime and MessageDate, and also from Message, whose content is itself XML.
I tried KV_MODE = xml in props.conf, but the field names I get carry the namespace prefix of each tag (e.g. soap:Envelop:requestheader:SendintTimestamp= 2013-11-07T17:50:07-05:00).

Is there any way to get the fields, automatically, without any namespace/xsl?
Appreciate your help.



Here's a more generic approach:
The following refinement of @martinh3's approach will remove all namespace prefixes (leaving the namespace declarations, which will simply do nothing) in one hit:

rex field=_raw mode=sed "s/(<\/?)([\w\d-]+):(\w+)([ \/>])/\1\3\4/g"

This will remove all namespace prefixes made up of word characters, numbers or "-".

If you are simply applying this to the whole raw message, you can leave out 'field=_raw'. If you have extracted your XML into a field as part of a search, replace 'field=_raw' with 'field=yourfieldname'.
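
If you want to see what that sed expression does before putting it in a search, here is a minimal sketch that applies the same pattern with Python's re module. It is only a check of the regex outside Splunk, not part of the Splunk solution. The names strip_prefixes and sample are made up for illustration, and the sample event is hypothetical, not the original log.

```python
import re

# Same pattern as the sed expression above:
#   group 1 = "<" or "</", group 2 = namespace prefix,
#   group 3 = local tag name, group 4 = the character after the tag name.
PREFIX_RE = re.compile(r"(<\/?)([\w\d-]+):(\w+)([ \/>])")


def strip_prefixes(xml_text: str) -> str:
    """Drop namespace prefixes from element tags, keeping groups 1, 3 and 4."""
    return PREFIX_RE.sub(r"\1\3\4", xml_text)


# Hypothetical sample event (assumption: not taken from the original log).
sample = (
    '<soap:Envelope xmlns:soap="urn:example">'
    '<audit:BroadcastAudit version="1.1">'
    '<xcs:Message>hello</xcs:Message>'
    '</audit:BroadcastAudit>'
    '</soap:Envelope>'
)

cleaned = strip_prefixes(sample)
print(cleaned)
```

Because the pattern is anchored on "<" or "</", prefixed attributes such as xmlns:soap are left alone. It also only matches when the tag name is followed by a space, "/" or ">", so a tag name followed directly by a line break, or one containing "-" or ".", would keep its prefix.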


Might not be the correct way, but the only way I found to do it is by deleting the namespaces. I had a few different ones in my file, so I needed 3 different "sed" statements to remove each. Like:

... | rex mode=sed "s/namespace1://g" | rex "begin XML: (?.*)" ...
