Automatically extract json object embedded in xml using props/transforms

charlesmeo · ‎08-19-2020

Hi there, I'm facing an interesting problem with fairly complex logs that contain one or more XML namespaces, some of which have JSON objects embedded in them. I have this working fairly reliably using a transform-based field extraction plus SPL (the logs are all over the place, but that's a separate problem!). I was wondering if anyone can suggest how to do this automatically using props and transforms, without having to use spath in a base search to do all the finagling.

transform extraction:

REGEX=<([:\w]+)>([^<]+)

FORMAT=$1::$2

This pulls out all the XML tags, which generally have the format tns:SomethingOrOther, along with their associated values.
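To see what that transform captures, here is a minimal Python sketch of the same regex, assuming the intended pattern is `<([:\w]+)>([^<]+)`. The sample event and the `extract_tag_fields` helper are made up for illustration; real events will look different, and Splunk's regex engine (PCRE) is not identical to Python's `re`, though this pattern behaves the same in both.

```python
import re

# Assumed intended pattern: a tag name made of word characters and colons,
# followed by its text content up to the next "<".
TAG_VALUE = re.compile(r"<([:\w]+)>([^<]+)")

# Hypothetical event, invented for illustration only.
SAMPLE_EVENT = (
    "<tns:Envelope><tns:RequestId>abc-123</tns:RequestId>"
    '<tns:Payload>{"user": {"id": 42}, "status": "ok"}</tns:Payload>'
    "</tns:Envelope>"
)


def extract_tag_fields(event: str) -> dict:
    """Mimic FORMAT=$1::$2: group 1 becomes the field name, group 2 its value."""
    return {name: value for name, value in TAG_VALUE.findall(event)}


fields = extract_tag_fields(SAMPLE_EVENT)
for name, value in fields.items():
    print(f"{name} = {value}")
```

Container tags such as tns:Envelope produce no field, because `[^<]+` needs at least one character of text before the next tag. Closing tags are skipped because `/` is not in the name character class. The JSON in tns:Payload survives intact only as long as it contains no `<` character.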

The json bit is usually in tns:Payload

SPL:

index=TheIndex sourcetype=theSourcetype | spath input=tns:Payload
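The second stage, parsing the JSON held in one already-extracted field, can be sketched in Python as below. This only approximates what spath does with input=tns:Payload: the `flatten` helper and the payload value are hypothetical, and lists are kept whole rather than expanded into multivalue fields.

```python
import json

# Hypothetical value of the tns:Payload field after the XML-tag step.
payload_field = '{"user": {"id": 42, "name": "x"}, "status": "ok"}'


def flatten(value, prefix=""):
    """Flatten nested JSON objects into dotted field names (user.id, ...)."""
    if isinstance(value, dict):
        flat = {}
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
        return flat
    # Scalars (and lists, in this simplified sketch) are stored as-is.
    return {prefix: value}


payload_fields = flatten(json.loads(payload_field))
for name, value in payload_fields.items():
    print(f"{name} = {value}")
```

The sketch makes the ordering dependency explicit: the JSON step takes the output of the XML step as its input, so it cannot run first.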

This is reliable enough but I'd like to recommend improvements to the site.

The more general question here is: where you have complex logs of this type, how can you configure props/transforms to do the right operations in the right sequence, i.e. XML first, then JSON extraction? If you try it the other way round, it wouldn't work, would it? If you had the opposite problem, XML embedded in JSON, how would you do that? Is it even possible to control the order in which props attempts structured extractions?

Certainly props supports extracting both types of values, but how do you know which one it tries first if you configure both? Props seems to assume you're only looking at one type of thing per sourcetype, which isn't the case here, unless I've missed something.

Of course if props simply executes them in order encountered in the file, there isn't much of an issue. However I have some constraints here which prevent me from just experimenting with it:

--no access to sourcetype configuration, nor am I likely to get it

--cannot download any of the data, which is commercial in confidence, and the site takes this very, very seriously

--no dev or test environment (!) Strange but true for an enterprise of this size and prominence.
