XML Extract multiple attributes as Key and Value

chrkohm
Path Finder

Hello everyone,

I'm new to Splunk and I'm having some problems getting my data onboarded.

In this case I'm trying to monitor an XML log file in this format:

<?xml version="1.0" encoding="UTF-8"?>
<logstore type="de.123.456.logstore.ABCLogstore" name="Horst" path="/mnt/111/Data/01_Driver/Horst" status="sync end">
  <properties>
    <property name="autosync" value="true" />
    <property name="synctime" value="1440" />
    <property name="lastsync" value="2019-12-09T11:16:12" />
    <property name="nextsync" value="2019-12-10T11:16:28" />
  </properties>
</logstore>

I'm trying to extract fields whose names are stored as attributes and whose values are also stored as attributes.

Example:

<property name="nextsync" value="2019-12-10T11:16:28" />

Field: nextsync
Value: 2019-12-10T11:16:28

and so on.

Does anyone have an idea how to do this?

I tried this, but in this case I get the names and the values separately:

[Logstore_XML]
DATETIME_CONFIG = CURRENT
KV_MODE = xml
LINE_BREAKER = (<logstore>)
MUST_BREAK_AFTER = \</logstore\>
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 0
pulldown_type = 1
FIELDALIAS-rootfields = logstore.properties.property{@name} as PropertyID logstore.properties.property as ProperyValue

Thanks
Chris


to4kawa
Ultra Champion
| makeresults 
 | eval _raw="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<logstore type=\"de.123.456.logstore.ABCLogstore\" name=\"Horst\" path=\"/mnt/111/Data/01_Driver/Horst\" status=\"sync end\">
  <properties>
    <property name=\"autosync\" value=\"true\" />
    <property name=\"synctime\" value=\"1440\" />
    <property name=\"lastsync\" value=\"2019-12-09T11:16:12\" />
    <property name=\"nextsync\" value=\"2019-12-10T11:16:28\" />
  </properties>
</logstore>"
| spath
| fields - _*
| transpose 0
| eval column=replace(mvindex(split(column,"."),-1),"{@(\w+)}","_\1")
| transpose 0 header_field=column 
| fields - column
| mvexpand property_value
| streamstats count
| eval property_name=mvindex(property_name,count - 1)
| table logstore_type, logstore_name, logstore_path, logstore_status, property_*

spath is useful.


dindu
Contributor

Hey,

Automatic XML extraction works as expected when the input XML has the form

   <field>value</field>

Here both the name and the value are stored as attributes, which is why KV_MODE=xml does not parse the data the way we expect.
You have multiple options for parsing the XML.
Please try them and let us know.

Method 1) You could parse it at search time using regexes like the ones below.

    | makeresults
    | eval _raw="
       <properties>
         <property name=\"autosync\" value=\"true\" />
         <property name=\"synctime\" value=\"1440\" />
         <property name=\"lastsync\" value=\"2019-12-09T11:16:12\" />
         <property name=\"nextsync\" value=\"2019-12-10T11:16:28\" />
       </properties>
    </logstore>"
    | rex field=_raw "autosync\"\svalue=\"(?P<auto_sync>[^\"]+)\"\s\/>"
    | rex field=_raw "synctime\"\svalue=\"(?P<sync_time>[^\"]+)\"\s\/>"
    | rex field=_raw "lastsync\"\svalue=\"(?P<last_sync>[^\"]+)\"\s\/>"
    | rex field=_raw "nextsync\"\svalue=\"(?P<next_sync>[^\"]+)\"\s\/>"
    | table auto_sync, sync_time, last_sync, next_sync
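
The four rex calls above need one line per property. A possible generalization is a single rex with max_match=0, which collects all names and all values into two multivalue fields. This is only a sketch: it assumes every property is written as `<property name="..." value="..." />` with name before value, and the field names prop_name and prop_value are made up for the illustration.

```spl
| makeresults
| eval _raw="<properties>
  <property name=\"autosync\" value=\"true\" />
  <property name=\"synctime\" value=\"1440\" />
</properties>"
| rex field=_raw max_match=0 "<property\s+name=\"(?<prop_name>[^\"]+)\"\s+value=\"(?<prop_value>[^\"]*)\""
| table prop_name prop_value
```

The character class `[^\"]` stops each capture at the closing quote, so the pattern does not depend on what follows the value attribute.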

Method 2) Use spath and multikv to extract the values.
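
As a rough illustration of Method 2 (spath only, multikv left out), the sketch below reads the name and value attributes with explicit spath paths and pairs them up with mvzip. The `{@name}` attribute path is the same syntax used in the FIELDALIAS line of the original question. The output field names prop_name, prop_value, pair, name, and value are invented for the example, and it assumes that no property name or value contains the = character.

```spl
| makeresults
| eval _raw="<logstore name=\"Horst\">
  <properties>
    <property name=\"autosync\" value=\"true\" />
    <property name=\"synctime\" value=\"1440\" />
  </properties>
</logstore>"
| spath output=prop_name path=logstore.properties.property{@name}
| spath output=prop_value path=logstore.properties.property{@value}
| eval pair=mvzip(prop_name, prop_value, "=")
| mvexpand pair
| eval name=mvindex(split(pair, "="), 0), value=mvindex(split(pair, "="), 1)
| table name value
```

This yields one result row per property. From there, `eval {name}=value` can turn each pair into a field named after the property.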

Method 3) If you want to extract it at index time, modify props.conf as below. The transform will extract the field and value. Modify the regex if the format is different.

    **props.conf**
    [Logstore_XML]
     DATETIME_CONFIG = CURRENT
     KV_MODE = xml
     LINE_BREAKER = (<logstore>)
     MUST_BREAK_AFTER = \</logstore\>
     NO_BINARY_CHECK = 1
     SHOULD_LINEMERGE = false
     TRUNCATE = 0
     pulldown_type = 1
     FIELDALIAS-rootfields = logstore.properties.property{@name} as PropertyID logstore.properties.property as ProperyValue
     TRANSFORMS-extract_xml = extract_xml

     **transforms.conf**
     [extract_xml]
     regex=property\sname="(.+)"\svalue(.+)\s\/>
     format=$1:$2
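
TRANSFORMS- in props.conf applies a transform at index time. For comparison, here is a minimal sketch of a search-time variant that references the transform with REPORT- instead. It is untested against this data and rests on a few assumptions: the class and stanza names (logstore_properties, logstore_property_kv) are hypothetical, each property is written as name followed by value, and the setting names are written in uppercase as they appear in the transforms.conf spec. When FORMAT has a capture group on the left of the double colon, the transforms.conf documentation describes the regex as being applied repeatedly at search time, so each property should become its own field.

```ini
# props.conf, sketch only
[Logstore_XML]
# REPORT- links a search-time transform; the class name is arbitrary (hypothetical here)
REPORT-logstore_properties = logstore_property_kv

# transforms.conf in the same app, sketch only
[logstore_property_kv]
# [^"]+ stops at the closing quote instead of relying on a greedy .+
REGEX = <property\s+name="([^"]+)"\s+value="([^"]*)"
# $1 becomes the field name, $2 its value (two colons)
FORMAT = $1::$2
```

With this in place, a search on the sourcetype would be expected to show fields such as autosync and nextsync, assuming the app's knowledge objects are shared with the app you search from.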

chrkohm
Path Finder

Hi dindu,

Thanks for your quick response!

I'm trying Method 3, but I'm struggling to create the transforms.conf. I've created the transforms.conf in this folder:

/opt/splunk/etc/apps/myApp/local/transforms.conf

but it doesn't seem to work. I also cannot see the transform in the Splunk GUI under the menu entry Field transformations.

Is it right that the format contains only one colon? Or should there be two colons (::)?

**transforms.conf**
      [extract_xml]
      regex=property\sname="(.+)"\svalue(.+)\s\/>
      format=$1::$2

dindu
Contributor

Hi,
There should be two colons. Sorry, that was a typo.
Try the regex in regex101 and check that it produces the desired results.

Please restart Splunk after making the changes.
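
One way to check whether Splunk has picked up a transforms.conf stanza at all is to list the search-time transforms over REST. The sketch below assumes the stanza is named extract_xml as in this thread, that your role is allowed to run the rest command, and that data/transforms/extractions is the endpoint backing the Field transformations page (an assumption worth verifying against the REST API reference for your version).

```spl
| rest /servicesNS/-/-/data/transforms/extractions
| search title="extract_xml"
| table title eai:acl.app eai:acl.sharing
```

If nothing comes back, the stanza is probably not being read, for example because of the app location or a missing restart. If it does come back, the app and sharing columns show where it lives and who can see it.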


chrkohm
Path Finder

Hi,

I still have an issue here, so I decided to delete the sourcetype, the field extraction, and the monitored inputs.

Then I created the field extraction via the web interface and called it xml_extract, with the regex `property\sname="(.+)"\svalue="(.+)"\s\/>` and the format `$1::$2`.

Then I set up the monitored input and created the sourcetype with the settings above. But the fields are still not extracted.
