Getting Data In

Automatically extract json object embedded in xml using props/transforms

charlesmeo
Explorer

Hi there, I'm facing an interesting problem with fairly complex logs consisting of one or more xml namespaces, some of which have JSON objects embedded in them. I have this working fairly reliably using a transform-based field extraction and  SPL (the logs are all over the place, separate problem!) but I was wondering if anyone can suggest how to do this automatically using props and transforms without having to use spath in a base search to do all the finagling.

transform extraction:

REGEX=<([:\w+)>([^<]+)

FORMAT=$1::$2

This gets out all the xml tags, which generally have the format tns:SomethingOrOther, and associated values.

The json bit is usually in tns:Payload

SPL:

index=TheIndex sourcetype=theSourcetype | spath input=tns:Payload

This is reliable enough but I'd like to recommend improvements to the site.

The more general question here is, where you have complex logs of this type, how can you configure props/transforms to do the right operations and in the right sequence? i.e. xml first, then json extraction. If you try the other way round, it wouldn't work--would it? If you had the opposite problem, xml embedded in json, how would you do that? Is it even possible to control the order in which props attempts structured extractions?

Certainly props supports extracting both types of values, but how do you know which one it tries first if you configure both? Props seems to assume you're only looking at one type of thing per sourcetype, not the case here--unless I've missed something.

Of course if props simply executes them in order encountered in the file, there isn't much of an issue. However I have some constraints here which prevent me from just experimenting with it:

--no access to sourcetype configuration, nor am I likely to get it

--cannot download any of the data, which is commercial in confidence and the site take this very, very seriously

--no dev or test environment (!) Strange but true for an enterprise of this size and prominence.

 

 

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Think Like an Architect: Introducing the Splunk Certified Cybersecurity Defense ...

In cybersecurity, defenders respond to threats. Architects design the systems that stop them.    As ...

Best Practices: Splunk auto adjust pipeline queue

When you enable autoAdjustQueue in Splunk, maxSize should be understood as the queue size Splunk starts with ...

Announcing Modern Navigation: A New Era of Splunk User Experience

We are excited to introduce the Modern Navigation feature in the Splunk Platform, available to both cloud and ...