How can I parse XML with multivalue fields?

wcooper003
Communicator

Here's a small snippet of an XML firewall event I'm trying to parse:

<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>

Ideally I'd like to set up a process that extracts the two entries above as separate fields (Temp_Ocelot=36.0, Temp_Switch=37.5). I know I can do this pretty easily with xpath at search time:

..... | xpath outfield=Temp_Ocelot "//response/result/thermal/Slot1/entry[description='Temperature @ Ocelot']/DegreesC"

But I'd like to define this in the configuration files so the fields are parsed out automatically. For instance, here's how I set up props.conf to extract the XML generically, so that it extracts all possible fields:

 [pa_env]
 DATETIME_CONFIG = CURRENT
 KV_MODE = xml
 LINE_BREAKER = (<response>)
 MUST_BREAK_AFTER = \</response\>
 NO_BINARY_CHECK = 1
 SHOULD_LINEMERGE = false
 TRUNCATE = 0
 pulldown_type = 1

But this leads to a lot of multivalue records, which I then have to deal with through mvzip, mvexpand, etc.

Is there a way to set up props.conf (or additionally transforms.conf) to extract the individual tags of interest as individual fields? At first I thought I could do something with the FIELDALIAS in props.conf to extract a specific entry description following how it's done in xpath, but that didn't work. Here's what I tried:

 FIELDALIAS-rootfields =  response.result.thermal.Slot1.entry[description='Temperature @ Ocelot'].DegreesC as Temp_Ocelot

Is there a way to specify a specific tag based on its properties in a FIELDALIAS?

Thanks

1 Solution

somesoni2
Revered Legend

Assuming the values Ocelot and Switch don't change, you can set up search-time field extractions for those fields.

props.conf on the search head:

[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)

You can see the regex working in the following run-anywhere sample search.

| gentimes start=-1 | eval _raw="<response status=\"success\">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>" | table _raw | rex "Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)" | rex "Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)"


wcooper003
Communicator

Thanks for that. I think these should be stable, but I'll have to check.

Note: the actual raw data doesn't arrive formatted with line breaks, so I had to modify the regex. Do you see any issues with how I did it below? I'm a regex noob.

| gentimes start=-1 
| eval _raw="<response status='success'><result>  <thermal>    <Slot1>      <entry>        <slot>1</slot>        <description>Temperature @ Ocelot</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>36.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>37.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>42.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Intel PHY</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>35.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>62.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>47.0</DegreesC>      </entry>    </Slot1>  </thermal>  </result> </response>" 
| table _raw 
| rex field=_raw "Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)"

wcooper003
Communicator

Thanks for your help. It's working well now that I've added it to props.conf.
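
For reference, the single-line regex from this thread could be carried into props.conf roughly as follows. This is only a sketch, not the exact configuration that was deployed: the [pa_env] stanza and the Temp_Ocelot and Temp_Switch field names come from the posts above, while the EXTRACT class names and the other four field names are made up for illustration. It also assumes every entry keeps the same tag order as the sample (min, max, alarm, then DegreesC) and that those three values contain only word characters and dots.

```
# props.conf on the search head (sketch)
# Class names and the last four field names are hypothetical.
[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Switch>[^\<]+)
EXTRACT-tempCavium = Temperature @ Cavium<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Cavium>[^\<]+)
EXTRACT-tempIntelPhy = Temperature @ Intel PHY<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Intel_PHY>[^\<]+)
EXTRACT-tempSwitchCore = Temperature @ Switch Core<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Switch_Core>[^\<]+)
EXTRACT-tempCaviumCore = Temperature @ Cavium Core<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Cavium_Core>[^\<]+)
```

Because each pattern is anchored on the closing description tag, "Temperature @ Switch" does not also match the "Temperature @ Switch Core" entry, and the same holds for Cavium and Cavium Core.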


somesoni2
Revered Legend

Looks good to me (and, more importantly, it works too).
