How can I parse XML with multivalue fields?

wcooper003
Communicator

Here's a small snippet of an XML firewall event I'm trying to parse:

<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>

Ideally, I'd like to set up a process to extract the two entries above as separate fields (Temp_Ocelot=36.0, Temp_Switch=37.5). I know I can do this pretty easily with xpath at search time:

..... | xpath outfield=Temp_Ocelot "//response/result/thermal/Slot1/entry[description='Temperature @ Ocelot']/DegreesC"

But I'd like to define this in the configuration files so the fields are parsed out automatically. For instance, here's how I set up props.conf to extract the XML generically, so that it extracts all possible fields:

 [pa_env]
 DATETIME_CONFIG = CURRENT
 KV_MODE = xml
 LINE_BREAKER = (<response>)
 MUST_BREAK_AFTER = \</response\>
 NO_BINARY_CHECK = 1
 SHOULD_LINEMERGE = false
 TRUNCATE = 0
 pulldown_type = 1

But this leads to a lot of multivalue records, which I then have to deal with through mvzip, mvexpand, etc.
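
For context, the multivalue handling described above usually looks something like the following run-anywhere sketch. It is an illustration only, not something posted in the thread: the sample event is trimmed to two entries, and the field names pair, sensor, and temp_c are hypothetical. It assumes spath names the extracted fields by their dotted XML path, the same naming the generic KV_MODE = xml extraction produces.

```
| gentimes start=-1
| eval _raw="<response status='success'><result><thermal><Slot1><entry><description>Temperature @ Ocelot</description><DegreesC>36.0</DegreesC></entry><entry><description>Temperature @ Switch</description><DegreesC>37.5</DegreesC></entry></Slot1></thermal></result></response>"
| spath
| eval pair=mvzip('response.result.thermal.Slot1.entry.description', 'response.result.thermal.Slot1.entry.DegreesC', "=")
| mvexpand pair
| rex field=pair "^Temperature @ (?<sensor>[^=]+)=(?<temp_c>.+)$"
| table sensor temp_c
```

This gives one row per sensor rather than one field per sensor, which is why a dedicated extraction per tag is more convenient here.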

Is there a way to set up props.conf (or additionally transforms.conf) to extract the individual tags of interest as individual fields? At first I thought I could do something with the FIELDALIAS in props.conf to extract a specific entry description following how it's done in xpath, but that didn't work. Here's what I tried:

 FIELDALIAS-rootfields =  response.result.thermal.Slot1.entry[description='Temperature @ Ocelot'].DegreesC as Temp_Ocelot

Is there a way to specify a specific tag based on its properties in a FIELDALIAS?

Thanks

1 Solution

somesoni2
Revered Legend

Assuming the values Ocelot and Switch don't change, you can set up search-time field extractions for those fields.

props.conf on the search head:

[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)

See the regex working in the following run-anywhere sample search.

| gentimes start=-1 | eval _raw="<response status=\"success\">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>" | table _raw | rex "Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)" | rex "Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)"

wcooper003
Communicator

Thanks for that. I think these should be stable, but I'll have to check.

Note - the actual raw data doesn't come in formatted with return characters, so I had to modify the regex. Do you see any issues with how I did it below? I'm a regex noob.

| gentimes start=-1 
| eval _raw="<response status='success'><result>  <thermal>    <Slot1>      <entry>        <slot>1</slot>        <description>Temperature @ Ocelot</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>36.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>37.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>42.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Intel PHY</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>35.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>62.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>47.0</DegreesC>      </entry>    </Slot1>  </thermal>  </result> </response>" 
| table _raw 
| rex field=_raw "Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)"
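
One property of the modified pattern worth noting: because it anchors on the closing </description> tag, "Temperature @ Switch" cannot accidentally match the "Temperature @ Switch Core" entry. The run-anywhere search below is a sketch of that, using a trimmed two-entry sample; the field name Temp_Switch_Core is hypothetical and not part of the thread.

```
| gentimes start=-1
| eval _raw="<entry> <slot>1</slot> <description>Temperature @ Switch</description> <min>0.0</min> <max>60.0</max> <alarm>False</alarm> <DegreesC>37.5</DegreesC> </entry> <entry> <slot>1</slot> <description>Temperature @ Switch Core</description> <min>0.0</min> <max>85.0</max> <alarm>False</alarm> <DegreesC>62.0</DegreesC> </entry>"
| rex field=_raw "Temperature @ Switch<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Switch>[^\<]+)"
| rex field=_raw "Temperature @ Switch Core<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Switch_Core>[^\<]+)"
| table Temp_Switch Temp_Switch_Core
```

The {3} group skips the min, max, and alarm tags, so the pattern depends on those three tags always appearing between description and DegreesC.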

wcooper003
Communicator

Thanks for your help, it's working well after I added it to props.conf.
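
The exact stanza that was added is not shown in the thread. A minimal sketch of what it presumably looks like, combining the [pa_env] stanza from the accepted answer with the single-line regex above, would be:

```
# props.conf on the search head (sketch; the final config was not posted)
[pa_env]
# Single-line variant: the {3} group skips the min, max, and alarm tags
# that sit between <description> and <DegreesC>.
EXTRACT-tempOcelot = Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Switch>[^\<]+)
```

The remaining sensors in the sample (Cavium, Intel PHY, and so on) would each need their own EXTRACT line following the same pattern.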


somesoni2
Revered Legend

Looks good to me (and, more importantly, it works too).
