I have a log whose events are XML with namespaces/XSL. Example log:
<soap:Envelope xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/ xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <soap:Header> <requestheader:RequestHeader> <requestheader:SendingTimeStamp>2013-11-07T17:50:07-05:00</requestheader:SendingTimeStamp> </requestheader:RequestHeader> <soap:Body> <audit:BroadcastAudit version="1.1"> <xcs:AuditInfo> <xcs:MessageDate>20131107</xcs:MessageDate> <xcs:MessageTime>175007-05:00</xcs:MessageTime> <xcs:DestSys>XXX</xcs:DestSys> <xcs:Message><****this is also some xml******></xcs:Message> </xcs:AuditInfo></audit:BroadcastAudit></soap:Body></soap:Envelope>
I am able to index it with proper timestamp recognition.
What I want to do is extract fields automatically from tags like DestSys, MessageTime and MessageDate, and also fields from Message, which is itself XML.
I tried KV_MODE = xml in props.conf, but the field names I get include the namespace prefixes (e.g. soap:Envelope:requestheader:SendingTimeStamp = 2013-11-07T17:50:07-05:00).
Is there any way to get the fields, automatically, without any namespace/xsl?
Appreciate your help.
Here's a more generic approach:
The following refinement of @martinh3's approach will remove all namespace prefixes (leaving the namespace declarations, which will simply do nothing) in one hit:
rex field=_raw mode=sed "s/(<\/?)([\w\d-]+):(\w+)([ \/>])/\1\3\4/g"
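This removes every element-name prefix made up of word characters, digits or "-", in both opening and closing tags. Prefixes on attributes (such as xsi:schemaLocation) are left alone, because the pattern only matches directly after "<" or "</".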
If you are simply applying this to the whole raw message, you can leave out 'field=_raw'. If you have extracted your XML into a field as part of a search, replace 'field=_raw' with 'field=yourfieldname'.
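To see what the substitution does without running a search, here is a rough Python sketch of the same pattern applied to a fragment of the example event. The function name strip_ns_prefixes and the shortened sample string are made up for illustration, and Python's re module is only standing in for Splunk's sed mode, so treat it as a way to check the regex rather than as what Splunk does internally.

```python
import re

# Same pattern as the rex command above, written for Python's re module.
# "\w" already covers digits, so "[\w-]" is equivalent to "[\w\d-]".
PREFIX_RE = re.compile(r"(</?)([\w-]+):(\w+)([ />])")


def strip_ns_prefixes(xml_text: str) -> str:
    """Drop the 'prefix:' part of element names in opening and closing tags."""
    # Keep group 1 ("<" or "</"), group 3 (local tag name) and group 4
    # (the character that ends the name); discard group 2 (the prefix).
    return PREFIX_RE.sub(r"\1\3\4", xml_text)


# Hypothetical shortened sample based on the event in the question.
sample = "<xcs:AuditInfo><xcs:DestSys>XXX</xcs:DestSys></xcs:AuditInfo>"
print(strip_ns_prefixes(sample))
# prints: <AuditInfo><DestSys>XXX</DestSys></AuditInfo>
```

Note that the local name is matched with \w+, so an element whose name contains "-" or "." after the prefix would not be rewritten by this pattern.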
This might not be the correct way, but the only way I found to do it is by deleting the namespace prefixes. I had a few different ones in my file, so I needed 3 separate "sed" statements, one per prefix. For example:
... | rex mode=sed "s/namespace1://g" | rex "begin XML: (?<yourfieldname>.*)" ...