Getting Data In

How can I parse XML with multivalue fields?

wcooper003
Communicator

Here's a small snippet of an xml firewall event i'm trying to parse:

<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>

Ideally i'd like to set up a process to extract the two entries above as separate fields (Temp_Ocelot=36.0, Temp_Switch=37.5). I know I can do this with xpath at search time pretty easily as:

..... | xpath outfield=Temp_Ocelot "//response/result/thermal/Slot1/entry[description='Temperature @ Ocelot']/DegreesC"

But i'd like to define this in the configuration files to parse out the fields automatically. For instance, here's how I set up a props.conf to extract the XML generically so that it extracts all possible fields:

 [pa_env]
 DATETIME_CONFIG = CURRENT
 KV_MODE = xml
 LINE_BREAKER = (<response>)
 MUST_BREAK_AFTER = \</response\>
 NO_BINARY_CHECK = 1
 SHOULD_LINEMERGE = false
 TRUNCATE = 0
 pulldown_type = 1

But this leads to a lot of multivalue records, which I then have to deal with through mvzip, mvexpand, etc.

Is there a way to set up props.conf (or additionally transforms.conf) to extract the individual tags of interest as individual fields? At first I thought I could do something with the FIELDALIAS in props.conf to extract a specific entry description following how it's done in xpath, but that didn't work. Here's what I tried:

 FIELDALIAS-rootfields =  response.result.thermal.Slot1.entry[description='Temperature @ Ocelot'].DegreesC as Temp_Ocelot

Is there a way to specify a specific tag based on its properties in a FIELDALIAS?

Thanks

0 Karma
1 Solution

somesoni2
Revered Legend

Assuming values Ocelot and Switch doesn't change, you can setup search time field extractions for those fields.

props.conf on search head

[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch =Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)

See the regex working in following runanywhere sample search.

| gentimes start=-1 | eval _raw="<response status=\"success\">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>" | table _raw | rex "Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)" | rex "Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)"

View solution in original post

0 Karma

somesoni2
Revered Legend

Assuming values Ocelot and Switch doesn't change, you can setup search time field extractions for those fields.

props.conf on search head

[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch =Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)

See the regex working in following runanywhere sample search.

| gentimes start=-1 | eval _raw="<response status=\"success\">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>" | table _raw | rex "Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)" | rex "Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)"
0 Karma

wcooper003
Communicator

Thanks for that, I think these should be stable but will have to check.

Note - the actual raw data doesn't come in formmated with return characters, so I had to modify the regex. Do you see any issues with how I did it below? I'm a regex noob.

| gentimes start=-1 
| eval _raw="<response status='success'><result>  <thermal>    <Slot1>      <entry>        <slot>1</slot>        <description>Temperature @ Ocelot</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>36.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>37.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>42.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Intel PHY</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>35.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>62.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>47.0</DegreesC>      </entry>    </Slot1>  </thermal>  </result> </response>" 
| table _raw 
| rex field=_raw "Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)"
0 Karma

wcooper003
Communicator

Thanks for your help, it's working good after I added to the props.conf.

0 Karma

somesoni2
Revered Legend

Looks good to me. (and more importantly works too)

0 Karma
Get Updates on the Splunk Community!

How to Monitor Google Kubernetes Engine (GKE)

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about ...

Index This | How can you make 45 using only 4?

October 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Splunk Education Goes to Washington | Splunk GovSummit 2024

If you’re in the Washington, D.C. area, this is your opportunity to take your career and Splunk skills to the ...