How can I parse XML with multivalue fields?

wcooper003
Communicator

Here's a small snippet of an XML firewall event I'm trying to parse:

<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>

Ideally I'd like to set up a process that extracts the two entries above as separate fields (Temp_Ocelot=36.0, Temp_Switch=37.5). I know I can do this pretty easily with xpath at search time:

..... | xpath outfield=Temp_Ocelot "//response/result/thermal/Slot1/entry[description='Temperature @ Ocelot']/DegreesC"
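For anyone who wants to check what that XPath predicate selects outside Splunk, here is a minimal Python sketch using the standard library's xml.etree.ElementTree on the sample event above. ElementTree implements only a subset of XPath and resolves paths relative to the root element, so the leading //response is dropped. The helper name degrees_for is made up for illustration, and this says nothing about how Splunk's xpath command works internally.

```python
import xml.etree.ElementTree as ET

# Sample event from the post.
EVENT = """<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>"""

root = ET.fromstring(EVENT)  # root is the <response> element


def degrees_for(description):
    """Return the DegreesC text of the <entry> whose <description> equals `description`."""
    # [description='...'] keeps only entries that have a child <description> with that exact text.
    path = f"result/thermal/Slot1/entry[description='{description}']/DegreesC"
    return root.findtext(path)  # None if no entry matches


temp_ocelot = degrees_for("Temperature @ Ocelot")
temp_switch = degrees_for("Temperature @ Switch")
print(temp_ocelot, temp_switch)
```

The predicate picks an entry by the text of its description child, which is exactly what the FIELDALIAS attempt further down tries to express.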

But I'd like to define this in the configuration files so the fields are parsed out automatically. For instance, here's how I set up props.conf to extract the XML generically, so that it extracts all possible fields:

 [pa_env]
 DATETIME_CONFIG = CURRENT
 KV_MODE = xml
 LINE_BREAKER = (<response>)
 MUST_BREAK_AFTER = \</response\>
 NO_BINARY_CHECK = 1
 SHOULD_LINEMERGE = false
 TRUNCATE = 0
 pulldown_type = 1

But this leads to a lot of multivalue fields, which I then have to deal with through mvzip, mvexpand, etc.

Is there a way to set up props.conf (or additionally transforms.conf) to extract the individual tags of interest as individual fields? At first I thought I could do something with the FIELDALIAS in props.conf to extract a specific entry description following how it's done in xpath, but that didn't work. Here's what I tried:

 FIELDALIAS-rootfields =  response.result.thermal.Slot1.entry[description='Temperature @ Ocelot'].DegreesC as Temp_Ocelot

Is there a way to specify a specific tag based on its properties in a FIELDALIAS?

Thanks

somesoni2
Revered Legend

Assuming the values Ocelot and Switch don't change, you can set up search-time field extractions for those fields.

props.conf on the search head:

[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)

See the regex working in the following run-anywhere sample search.

| gentimes start=-1 | eval _raw="<response status=\"success\">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>" | table _raw | rex "Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)" | rex "Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)"
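The two patterns can also be checked outside Splunk. The sketch below runs them with Python's re module against the same multi-line sample event. Splunk uses PCRE, so the only change made here is writing the named groups as (?P<name>...) instead of (?<name>...), and dropping the backslashes before < and >, which are not needed. The variable names are illustrative.

```python
import re

# Same multi-line sample event as in the run-anywhere search.
RAW = """<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>"""

# After the description line, (.+[\r\n]+){3} skips the <min>, <max> and <alarm> lines,
# then \s+ eats the indentation in front of <DegreesC>.
OCELOT = re.compile(
    r"Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+<DegreesC>(?P<Temp_Ocelot>[^<]+)"
)
SWITCH = re.compile(
    r"Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+<DegreesC>(?P<Temp_Switch>[^<]+)"
)

fields = {}
for pattern in (OCELOT, SWITCH):
    match = pattern.search(RAW)
    if match:
        fields.update(match.groupdict())  # named groups only
print(fields)

# The patterns rely on line breaks: on a copy of the event collapsed to one line they find nothing.
flat = " ".join(RAW.split())
no_match_when_flat = OCELOT.search(flat) is None
print(no_match_when_flat)
```

Because each pattern counts exactly three lines between description and DegreesC, it assumes the tag order inside an entry stays fixed and that every tag sits on its own line.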

wcooper003
Communicator

Thanks for that, I think these should be stable but will have to check.

Note - the actual raw data doesn't come in formatted with return characters, so I had to modify the regex. Do you see any issues with how I did it below? I'm a regex noob.

| gentimes start=-1 
| eval _raw="<response status='success'><result>  <thermal>    <Slot1>      <entry>        <slot>1</slot>        <description>Temperature @ Ocelot</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>36.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>37.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>42.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Intel PHY</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>35.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>62.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>47.0</DegreesC>      </entry>    </Slot1>  </thermal>  </result> </response>" 
| table _raw 
| rex field=_raw "Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)"
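As a quick cross-check of that single-line pattern, here is a small Python sketch that rebuilds the flattened event from the six sensor readings shown above and applies the same regex for each sensor name. The assumptions are: Python's re needs (?P<name>...) for named groups where Splunk's PCRE accepts (?<name>...), the escaping backslashes before / and < are dropped because they are not needed, and the names SENSORS, ENTRY and extract are made up for illustration.

```python
import re

# (sensor name, max, DegreesC) for the six entries in the sample event above.
SENSORS = [
    ("Ocelot", "60.0", "36.0"),
    ("Switch", "60.0", "37.5"),
    ("Cavium", "60.0", "42.5"),
    ("Intel PHY", "60.0", "35.0"),
    ("Switch Core", "85.0", "62.0"),
    ("Cavium Core", "85.0", "47.0"),
]

# One <entry> on a single line, with spaces instead of line breaks between tags.
ENTRY = (
    "<entry>        <slot>1</slot>        "
    "<description>Temperature @ {0}</description>        "
    "<min>0.0</min>        <max>{1}</max>        "
    "<alarm>False</alarm>        <DegreesC>{2}</DegreesC>      </entry>"
)

RAW = (
    "<response status='success'><result>  <thermal>    <Slot1>      "
    + "      ".join(ENTRY.format(*sensor) for sensor in SENSORS)
    + "    </Slot1>  </thermal>  </result> </response>"
)


def extract(name):
    """Apply the single-line pattern from the post for one sensor name."""
    pattern = (
        r"Temperature @ " + re.escape(name) + r"</description>\s+"
        # Skip three complete <tag>value</tag> elements (min, max, alarm).
        # \d is redundant next to \w but harmless, so it is kept as posted.
        r"(<\w+>[\w\d.]+</\w+>\s+){3}"
        r"<DegreesC>(?P<temp>[^<]+)"
    )
    match = re.search(pattern, RAW)
    return match.group("temp") if match else None


temps = {name: extract(name) for name, _, _ in SENSORS}
print(temps)
```

Anchoring on the closing </description> tag right after the sensor name is what keeps "Switch" from also matching "Switch Core", and "Cavium" from matching "Cavium Core".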

wcooper003
Communicator

Thanks for your help, it's working well now that I've added it to props.conf.

somesoni2
Revered Legend

Looks good to me (and, more importantly, it works too).
