Parsing XML data from fields

Kabobgub
Explorer

Hello. After a lot of research I still can't figure out how to solve this problem.
I have an XML file indexed in Splunk, and I've extracted fields with KV_MODE = xml.

          <result name="MISCONF_STATUS.SUCCESS"><![CDATA[154]]></result>
          <result name="MISCONF_RISK.HIGH"><![CDATA[39]]></result>
          <result name="MISCONF_ALL"><![CDATA[606]]></result>

So I have two fields here: result{@name} and result. The second holds the CDATA value. The problem is that the two are not connected to each other.
How can I define that MISCONF_STATUS.SUCCESS = 154, and so on?
I tried to build a chart from these two fields, but it does not work at all.

iamtags
Engager

I was running into the same problem where I only needed a simple table merging a couple of xml values from many, and potentially multiple times per event.

To build off of what sideview ♦ explained, and from the mvexpand docs, I think I have a way to help you get just the fields you care about in a simple table. Notice that the first few lines are the same as what was already posted.

| rename "result{@name}" as result_name
| fields result_name result
| eval zipped=mvzip(result_name,result)
| mvexpand zipped

This is where the search changes a little to match what I think you are asking for. You can just run rex against the new field you just created:

| rex field=zipped "(?<result_name>\S+),(?<result>\d+)"
| table result_name result

Should be displayed like

result_name            result
MISCONF_STATUS.SUCCESS 154
MISCONF_RISK.HIGH      39
MISCONF_ALL            606

These results are then connected so you could get only specific events by appending

| where result_name="MISCONF_ALL" AND result="606"

For some visualizations you can also change

| table result_name result 

to something like

| stats values(result_name) by result
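
If the goal is a bar or column chart with one bar per name, a numeric aggregate keyed on the name may be easier to plot. This is only a sketch under assumptions: `makeresults count=2` and the `split` calls stand in for two real events, the field names `name`, `value`, and `total` are made up for illustration, and summing across events may or may not be the aggregate you want.

```spl
| makeresults count=2
| eval result_name=split("MISCONF_STATUS.SUCCESS,MISCONF_RISK.HIGH,MISCONF_ALL", ",")
| eval result=split("154,39,606", ",")
| eval zipped=mvzip(result_name, result)
| mvexpand zipped
| rex field=zipped "^(?<name>[^,]+),(?<value>\d+)$"
| eval value=tonumber(value)
| stats sum(value) as total by name
```

Swapping `sum` for `max` or `latest` would change what each bar represents, so pick whichever matches how the counts are reported.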

Hope this helps

0 Karma

sideview
SplunkTrust

somesoni2's sed-based approach may well be the best one, but here's some fun search language that can do the same.

I'm assuming that you have big multiline events that each have big multivalue values for your two fields "result{@name}" and "result".

| rename "result{@name}" as result_name
| eval zipped=mvzip(result_name,result)
| streamstats count as counter
| mvexpand zipped
| eval zipped = split(zipped,",")
| eval result = mvindex(zipped,0)
| eval {result}=mvindex(zipped,1)
| fields - zipped
| stats values(*) as * by counter
| fields - counter

It's a bit of a circus act but it'll work. eval's mvzip function can zip up two big multivalue values into a third multivalue field whose values look like "foo1,bar1" "foo2,bar2" etc. Then we kind of take the results apart and put them back together again the way we need them.
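
To watch the zip and expand stages on their own, the same idea can be tried against synthetic data. This is a minimal sketch, assuming a single event that carries the three values from the question; `makeresults` and the `split` calls only stand in for the real XML extraction, and the field names `pair`, `name`, and `value` are made up for illustration.

```spl
| makeresults
| eval result_name=split("MISCONF_STATUS.SUCCESS,MISCONF_RISK.HIGH,MISCONF_ALL", ",")
| eval result=split("154,39,606", ",")
| eval pair=mvzip(result_name, result)
| mvexpand pair
| eval name=mvindex(split(pair, ","), 0), value=mvindex(split(pair, ","), 1)
| table name value
```

mvzip joins the values positionally with a comma by default, mvexpand turns the one event into one row per pair, and the final eval splits each pair back into a name and a value. Dropping the last two lines and adding `| table pair` shows the intermediate zipped values.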

0 Karma

Kabobgub
Explorer

Thanks. It should work, but in this case I get a table with ALL my fields displayed. Could you tell me how I can use only these two fields?

0 Karma

sideview
SplunkTrust

It will work fine with other field values. They should be carried along throughout.

0 Karma

Kabobgub
Explorer

The reason it is not suitable is that I have some junk fields in this case. All I need is to connect these two fields and build some visualisation from them. Thanks for your solution, but it differs a little from what I need. I would appreciate any advice for my case.

0 Karma

sideview
SplunkTrust

I'm afraid that I do not understand the problem you are trying to describe, possibly because it is not a problem at all. Can you describe why you think the other junk field values prevent this solution from giving you your visualization?

0 Karma

Kabobgub
Explorer

The problem is that this is part of a very large system, and this search generated too much data for the current visualization. I would really appreciate it if you could tell me how to customize this search, or which commands I need for my goals. For example, I may need to see only the values of MISCONF_RISK.HIGH, or only the values of MISCONF_ALL, or all values except MISCONF_STATUS.SUCCESS. I've tried a few ways to do it, but it is too complicated for me.

0 Karma

sideview
SplunkTrust

If you just want these two fields, then you want to insert a fields command to explicitly filter out all other fields.

| rename "result{@name}" as result_name
| fields result_name result
| eval zipped=mvzip(result_name,result)
| streamstats count as counter
| mvexpand zipped
| eval zipped = split(zipped,",")
| eval result = mvindex(zipped,0)
| eval {result}=mvindex(zipped,1)
| fields - zipped
| stats values(*) as * by counter
| fields - counter

If you're getting an error that the search generated too much data for the visualization, that has more to do with the visualization you're trying to use. For instance, if you try to generate a 1-year timechart at 5-minute granularity, you'll get errors like that in the UI.
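
For the narrower cases mentioned earlier (only MISCONF_RISK.HIGH, or everything except MISCONF_STATUS.SUCCESS), one option is to keep one row per name/value pair and filter with where. This is a sketch rather than a tested answer for your data: `makeresults` and the `split` calls stand in for the real events, and the field names `name` and `value` are made up for illustration.

```spl
| makeresults
| eval result_name=split("MISCONF_STATUS.SUCCESS,MISCONF_RISK.HIGH,MISCONF_ALL", ",")
| eval result=split("154,39,606", ",")
| eval zipped=mvzip(result_name, result)
| mvexpand zipped
| rex field=zipped "^(?<name>[^,]+),(?<value>\d+)$"
| where name!="MISCONF_STATUS.SUCCESS"
| table name value
```

To keep a single name instead, the where line could become `| where name="MISCONF_RISK.HIGH"`. Against real events, the first three lines would be replaced by your base search plus the rename of "result{@name}".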

0 Karma

somesoni2
SplunkTrust

Try something like this

your base search with field _raw | rex mode=sed "s/(\>\<\!\[CDATA\[)([^\]]+)(\]\])/ value=\"\2\"/g" | spath | rename result{@*} as * | eval {name}=value

Kabobgub
Explorer

It looks right, but it is not working.

0 Karma

sideview
SplunkTrust

To clarify - the specific XML you posted ends up in a single event, and that event has two fields, both of which have big "multivalue" values: (MISCONF_STATUS.SUCCESS, MISCONF_RISK.HIGH, MISCONF_ALL) and (154, 39, 606). If you can confirm this then I think I can give you a search language answer.

0 Karma

Kabobgub
Explorer

Almost. Actually, it is nested inside

<group > 
   <service>
        "this part"
   </service>
</group > 

The rest is right.

0 Karma

markthompson
Builder

I would imagine you can use regex for this. You should be able to generate a field based on a regular expression.

0 Karma