How can I parse XML with multivalue fields?

wcooper003
Communicator

Here's a small snippet of an XML firewall event I'm trying to parse:

<response status="success">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>

Ideally, I'd like to set up a process to extract the two entries above as separate fields (Temp_Ocelot=36.0, Temp_Switch=37.5). I know I can do this pretty easily with xpath at search time, like this:

..... | xpath outfield=Temp_Ocelot "//response/result/thermal/Slot1/entry[description='Temperature @ Ocelot']/DegreesC"

But I'd like to define this in the configuration files so the fields are parsed out automatically. For instance, here's how I set up props.conf to extract the XML generically, so that it extracts all possible fields:

 [pa_env]
 DATETIME_CONFIG = CURRENT
 KV_MODE = xml
 LINE_BREAKER = (<response>)
 MUST_BREAK_AFTER = \</response\>
 NO_BINARY_CHECK = 1
 SHOULD_LINEMERGE = false
 TRUNCATE = 0
 pulldown_type = 1

But this leads to a lot of multivalue records, which I then have to deal with through mvzip, mvexpand, etc.
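
For reference, the mvzip/mvexpand workaround mentioned here usually looks something like the sketch below. It is only an illustration: the dotted field names are assumed from the XML structure shown above, spath is included so the sketch does not depend on KV_MODE, and sensor and pair are hypothetical field names.

```spl
... | spath
| eval pair=mvzip('response.result.thermal.Slot1.entry.description', 'response.result.thermal.Slot1.entry.DegreesC', "|")
| mvexpand pair
| eval sensor=mvindex(split(pair, "|"), 0), DegreesC=mvindex(split(pair, "|"), 1)
| table sensor DegreesC
```

This pairs each description with its DegreesC value and expands the event into one row per sensor, which is the extra work the question is trying to avoid.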

Is there a way to set up props.conf (or additionally transforms.conf) to extract the individual tags of interest as individual fields? At first I thought I could do something with the FIELDALIAS in props.conf to extract a specific entry description following how it's done in xpath, but that didn't work. Here's what I tried:

 FIELDALIAS-rootfields =  response.result.thermal.Slot1.entry[description='Temperature @ Ocelot'].DegreesC as Temp_Ocelot

Is there a way to specify a specific tag based on its properties in a FIELDALIAS?

Thanks

1 Solution

somesoni2
Revered Legend

Assuming the values Ocelot and Switch don't change, you can set up search-time field extractions for those fields.

props.conf on the search head

[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)

You can see the regex working in the following run-anywhere sample search.

| gentimes start=-1 | eval _raw="<response status=\"success\">
    <result>
        <thermal>
            <Slot1>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Ocelot</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>36.0</DegreesC>
                </entry>
                <entry>
                    <slot>1</slot>
                    <description>Temperature @ Switch</description>
                    <min>0.0</min>
                    <max>60.0</max>
                    <alarm>False</alarm>
                    <DegreesC>37.5</DegreesC>
                </entry>
            </Slot1>
        </thermal>
    </result>
</response>" | table _raw | rex "Temperature @ Ocelot.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Ocelot>[^\<]+)" | rex "Temperature @ Switch.+[\r\n]+(.+[\r\n]+){3}\s+\<DegreesC\>(?<Temp_Switch>[^\<]+)"

wcooper003
Communicator

Thanks for that. I think these should be stable, but I'll have to check.

Note: the actual raw data doesn't come in formatted with line breaks, so I had to modify the regex. Do you see any issues with how I did it below? I'm a regex noob.

| gentimes start=-1 
| eval _raw="<response status='success'><result>  <thermal>    <Slot1>      <entry>        <slot>1</slot>        <description>Temperature @ Ocelot</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>36.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>37.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>42.5</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Intel PHY</description>        <min>0.0</min>        <max>60.0</max>        <alarm>False</alarm>        <DegreesC>35.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Switch Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>62.0</DegreesC>      </entry>      <entry>        <slot>1</slot>        <description>Temperature @ Cavium Core</description>        <min>0.0</min>        <max>85.0</max>        <alarm>False</alarm>        <DegreesC>47.0</DegreesC>      </entry>    </Slot1>  </thermal>  </result> </response>" 
| table _raw 
| rex field=_raw "Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)"

wcooper003
Communicator

Thanks for your help, it's working well now that I've added it to props.conf.
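
The thread doesn't show the final stanza. A sketch of what it might look like, assuming the same [pa_env] sourcetype and the EXTRACT class names from the earlier answer, with the single-line regex from the search above. The Temp_Switch line is an assumption, extrapolated from the Ocelot regex.

```
[pa_env]
EXTRACT-tempOcelot = Temperature @ Ocelot<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Ocelot>[^\<]+)
EXTRACT-tempSwitch = Temperature @ Switch<\/description>\s+(<\w+>[\w\d.]+<\/\w+>\s+){3}<DegreesC>(?<Temp_Switch>[^\<]+)
```

Anchoring on the closing </description> tag should keep the Switch extraction from picking up the "Temperature @ Switch Core" entry, since that description continues past the word Switch. As with the earlier answer, these are search-time extractions, so the stanza would go on the search head.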

somesoni2
Revered Legend

Looks good to me (and, more importantly, it works too).
