Splunk Search

How to write the regex to extract a field from XML data if the field is not completely XML?

jameskerivan
Explorer

Hi

I have a field from which I would like to extract a value out of the XML being displayed. The only problem is that the field is not completely XML. I am not allowed to post a real example, but basically I want to extract from something that looks like:

Event xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"><ns2:behaviorVersion>0</ns2:behaviorVersion><triggers><channelId>0055</channelId><clientVersion>3</clientVersion></triggers><eventInfo><bos:instanceId>000121481</bos:instanceId><bos:serverName>1</bos:serverName><bos:implementationName>TransferStarted</bos:implementationName>

And I would like to grab TransferStarted in between the two tags <bos:implementationName> and </bos:implementationName>.

I have worked with regex in the past, but am still not confident. Any help would be much appreciated and Happy New Year!

1 Solution

sundareshr
Legend

Have you tried this?

implementationName\>(\w+)\<


jameskerivan
Explorer

Yes, this is what I want. Right now I am doing:

base query | rex field=F "(?<preName>.*)implementationName\>(\w+)\<" | stats count by preName | sort count desc

But this is providing me with everything before implementationName as I specified. How would I extract that field? The way I see the regex working is it matches implementationName and looks for the characters > < for opening and closing of the value I want. Do I need to specify a variable for that value?


sundareshr
Legend

Try this, assuming preName is the name you want for that field.

"implementationName>(?<preName>w+)<"

sundareshr
Legend

Correction: there should be a backslash before "w+" (that is, \w+).
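
For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch of the same capture. It is only an illustration, not how Splunk runs it: Python's `re` spells the named group `(?P<preName>...)` where `rex` accepts `(?<preName>...)`, and the sample string is trimmed from the event in the question.

```python
import re

# Sample text trimmed from the event posted in the question.
event = (
    "<bos:serverName>1</bos:serverName>"
    "<bos:implementationName>TransferStarted</bos:implementationName>"
)

# Same pattern as the rex suggestion; only the named-group spelling differs.
pattern = re.compile(r"implementationName>(?P<preName>\w+)<")

# Without the backslash, "w+" matches literal "w" characters, not word characters.
broken = re.compile(r"implementationName>(?P<preName>w+)<")

match = pattern.search(event)
pre_name = match.group("preName") if match else None
broken_match = broken.search(event)

print(pre_name)
print(broken_match)
```

The version without the backslash finds nothing in this sample, which is why the correction matters.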


jameskerivan
Explorer

The stats this gives me are very confusing. Here is my query:

base query | rex field=F "implementationName>(?<preName>\w+)<" | stats count by preName | sort count desc

This is giving me only a small number of the implementationNames, not all of them. For example, TransferStarted did not get counted in my stats, but if I look in the events I can see it. Am I missing something?


sundareshr
Legend

If there is more than one occurrence of the preName value in a single event, you should add max_match=0 to the rex command and use multivalue functions to get the right result.
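
To see why a single-match extraction can undercount, here is a small Python sketch. It is an analogy, not Splunk code: `search` stands in for `rex` with its default of one match per event, `findall` for `max_match=0` followed by a multivalue step such as `mvexpand`, and `Counter` for `stats count by preName`. The two sample events are made up for illustration.

```python
import re
from collections import Counter

PATTERN = re.compile(r"implementationName>(\w+)<")

# Hypothetical events: the first one carries two implementationName tags.
events = [
    "<bos:implementationName>TransferQueued</bos:implementationName>"
    "<bos:implementationName>TransferStarted</bos:implementationName>",
    "<bos:implementationName>TransferStarted</bos:implementationName>",
]

# First match per event only, like rex with the default max_match=1.
first_only = Counter(
    m.group(1) for m in (PATTERN.search(e) for e in events) if m
)

# Every match per event, like rex max_match=0 plus expanding the multivalue field.
all_matches = Counter(name for e in events for name in PATTERN.findall(e))

print(first_only)
print(all_matches)
```

In this sample the first-match count sees TransferStarted once, while counting every match sees it twice.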


jameskerivan
Explorer

Thank you very much. You have been so helpful. The problem I am coming across is with the way we are logging. Your query is correct!
