I have a log file entry that looks like this (this is the VERBATIM entry from the access log):
2012-08-06 13:25:02,159 INFO [Listener-5] Listener - DeviceData processed: execution-time=[540ms], message=[< ?xml version="1.0" encoding="UTF-8" standalone="yes" ? > < ns2:DeviceData xmlns:ns2="http://www.abc.com/DeviceData/" > < TransactionId>1234<\/TransactionId> ..... (the rest of the xml content followed by other typical access log data)
I don't know whether this counts as an "XML log file" per se.
But when I try to extract the TransactionId with xmlkv, it just fails:
sourcetype="access_c*" | xmlkv | table TransactionId
Is this the wrong way to go about it? Should I just use regexes? Are there other Splunk commands I ought to be using, assuming this file does NOT qualify as an XML log file?
I would do it this way:
sourcetype=access_c*
| rex "TransactionId>(?<TransactionId>.*?)\</TransactionId>"
| table TransactionId
I don't think this format qualifies as an XML log file; only the message part seems to be XML. Also, the xmlkv command is not very fast. As you have used it, it would extract every field, not just the TransactionId, if it worked at all.
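To see what the capture group in that rex does, here is a minimal Python sketch of the same pattern applied to a shortened copy of the sample event. It is not SPL: Python writes named groups as `(?P<name>...)` where Splunk's rex uses `(?<name>...)`, and the `sample_event` string and the `extract_transaction_id` helper are made up for illustration. It also assumes the closing tag appears in the raw event as `</TransactionId>`.

```python
import re

# Shortened, hypothetical copy of the event; assumes the closing tag
# is stored as </TransactionId> in the raw data.
sample_event = (
    "2012-08-06 13:25:02,159 INFO [Listener-5] Listener - DeviceData processed: "
    "execution-time=[540ms], message=[< TransactionId>1234</TransactionId> ]"
)

# Same idea as the rex: a lazy capture between the opening and closing tags.
PATTERN = re.compile(r"TransactionId>(?P<TransactionId>.*?)</TransactionId>")


def extract_transaction_id(event):
    """Return the captured TransactionId, or None when the pattern does not match."""
    match = PATTERN.search(event)
    return match.group("TransactionId") if match else None


print(extract_transaction_id(sample_event))
```

The lazy `.*?` stops at the first closing tag, so only that one value is captured instead of everything up to the last closing tag in the event.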
If you need to extract all the fields from the message, you could use spath, like this:
sourcetype=access_c*
| spath input=message
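As a rough illustration of what "extract all the fields" means for an XML message, the Python sketch below flattens a small XML document into dotted path/value pairs. It is only an analogy: the `message` string is a hypothetical, complete stand-in for the real (longer) XML, `flatten` is a made-up helper, and the field names spath actually produces in Splunk may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical, complete message body; the real event's XML is longer.
message = (
    '<ns2:DeviceData xmlns:ns2="http://www.abc.com/DeviceData/">'
    "<TransactionId>1234</TransactionId>"
    "</ns2:DeviceData>"
)


def flatten(element, prefix=""):
    """Map dotted element paths to text values for leaf elements."""
    # Strip the "{namespace-uri}" part that ElementTree prepends to tags.
    name = element.tag.split("}")[-1]
    path = f"{prefix}.{name}" if prefix else name
    children = list(element)
    if not children:
        return {path: (element.text or "").strip()}
    fields = {}
    for child in children:
        # Repeated sibling tags overwrite each other in this simple sketch.
        fields.update(flatten(child, path))
    return fields


fields = flatten(ET.fromstring(message))
print(fields)
```

Unlike the single-field rex, this walks the whole document, which is the trade-off mentioned above: more fields, more work per event.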
The search sourcetype="access_c*" | xmlkv | table TransactionId seems to work, so why do we need rex "TransactionId>(?<TransactionId>.*?)\</TransactionId>"? Is it only a wildcard to pick up every occurrence of the TransactionId?