Hello everyone,
I'm new to Splunk and I'm having trouble getting my data onboarded.
In this case I'm trying to monitor an XML log file in this format:
<?xml version="1.0" encoding="UTF-8"?>
<logstore type="de.123.456.logstore.ABCLogstore" name="Horst" path="/mnt/111/Data/01_Driver/Horst" status="sync end">
<properties>
<property name="autosync" value="true" />
<property name="synctime" value="1440" />
<property name="lastsync" value="2019-12-09T11:16:12" />
<property name="nextsync" value="2019-12-10T11:16:28" />
</properties>
</logstore>
I'm trying to extract fields whose names are stored as attributes and whose corresponding values are also stored as attributes.
Example:
<property name="nextsync" value="2019-12-10T11:16:28" />
Field: nextsync
Value: 2019-12-10T11:16:28
and so on.
Does anyone have an idea how to do this?
I tried this, but in this case I get the names and the values separated:
[Logstore_XML]
DATETIME_CONFIG = CURRENT
KV_MODE = xml
LINE_BREAKER = (<logstore>)
MUST_BREAK_AFTER = \</logstore\>
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 0
pulldown_type = 1
FIELDALIAS-rootfields = logstore.properties.property{@name} as PropertyID logstore.properties.property as ProperyValue
Thanks
Chris
| makeresults
| eval _raw="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<logstore type=\"de.123.456.logstore.ABCLogstore\" name=\"Horst\" path=\"/mnt/111/Data/01_Driver/Horst\" status=\"sync end\">
<properties>
<property name=\"autosync\" value=\"true\" />
<property name=\"synctime\" value=\"1440\" />
<property name=\"lastsync\" value=\"2019-12-09T11:16:12\" />
<property name=\"nextsync\" value=\"2019-12-10T11:16:28\" />
</properties>
</logstore>"
| spath
| fields - _*
| transpose 0
| eval column=replace(mvindex(split(column,"."),-1),"{@(\w+)}","_\1")
| transpose 0 header_field=column
| fields - column
| mvexpand property_value
| streamstats count
| eval property_name=mvindex(property_name,count - 1)
| table logstore_type, logstore_name, logstore_path, logstore_status, property_*
spath is useful.
Hey,
Automatic XML extraction gives you field/value pairs directly only if the input XML is in the format
<field>value</field>
In your file both the field name and the value are attributes, which is why KV_MODE=xml does not parse it as we expect.
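To make the difference concrete outside of Splunk, here is a minimal Python sketch (not Splunk code; the variable names are made up for illustration). It assumes that a generic attribute extraction produces one list of names and one list of values, as described in the question, and shows the pairing step that is still missing. The XML declaration is left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

# Sample event from the question (XML declaration omitted).
SAMPLE = """<logstore type="de.123.456.logstore.ABCLogstore" name="Horst" path="/mnt/111/Data/01_Driver/Horst" status="sync end">
<properties>
<property name="autosync" value="true" />
<property name="synctime" value="1440" />
<property name="lastsync" value="2019-12-09T11:16:12" />
<property name="nextsync" value="2019-12-10T11:16:28" />
</properties>
</logstore>"""

root = ET.fromstring(SAMPLE)
props = root.findall("./properties/property")

# Extracting each attribute on its own gives two parallel lists:
# the names and the values are separated.
names = [p.get("name") for p in props]
values = [p.get("value") for p in props]

# The desired result pairs each name with the value of the same element.
fields = dict(zip(names, values))

print(names)
print(values)
print(fields["nextsync"])
```

The pairing relies on both lists keeping document order, which is also what the search-time approaches in this thread depend on.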
You have multiple options to parse the XML.
Please try them and let us know.
Method 1) You could parse it at search time by using regexes like the ones below.
| makeresults
| eval _raw="
<properties>
<property name=\"autosync\" value=\"true\" />
<property name=\"synctime\" value=\"1440\" />
<property name=\"lastsync\" value=\"2019-12-09T11:16:12\" />
<property name=\"nextsync\" value=\"2019-12-10T11:16:28\" />
</properties>
</logstore>"
| rex field=_raw "autosync\"\svalue=\"(?P<auto_sync>.+)\"\s\/\>"
| rex field=_raw "synctime\"\svalue=\"(?P<sync_time>.+)\"\s\/\>"
| rex field=_raw "lastsync\"\svalue=\"(?P<last_sync>.+)\"\s\/\>"
| rex field=_raw "nextsync\"\svalue=\"(?P<next_sync>.+)\"\s\/\>"
| table auto_sync, sync_time, last_sync, next_sync
Method 2) Use spath and multikv to extract the values.
Method 3) If you want to extract the fields at index time, modify props.conf as below. The transform will extract the field and value. Modify the regex if the format is different.
**props.conf**
[Logstore_XML]
DATETIME_CONFIG = CURRENT
KV_MODE = xml
LINE_BREAKER = (<logstore>)
MUST_BREAK_AFTER = \</logstore\>
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 0
pulldown_type = 1
FIELDALIAS-rootfields = logstore.properties.property{@name} as PropertyID logstore.properties.property as ProperyValue
TRANSFORMS-extract_xml = extract_xml
**transforms.conf**
[extract_xml]
regex=property\sname="(.+)"\svalue(.+)\s\/>
format=$1:$2
Hi dindu,
Thanks for your quick response!
I'm trying Method 3, but I'm struggling with creating the transforms.conf. I've created the transforms.conf in this folder:
/opt/splunk/etc/apps/myApp/local/transforms.conf
but it doesn't seem to work. I also can't see the transformation in the Splunk GUI under the menu entry for field transformations.
Is it right that the format has only one colon? Or should there be two colons (::)?
**transforms.conf**
[extract_xml]
regex=property\sname="(.+)"\svalue(.+)\s\/>
format=$1::$2
Hi,
There should be two colons. Sorry, that was a typo.
Try the regex in regex101 and check that it outputs the desired results.
Please restart Splunk after making the changes.
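As an alternative to regex101, the regex can be checked offline with a short Python sketch. This is only an approximation: Python's `re` module is not the engine Splunk uses, but the patterns here use only constructs that both share. `POSTED` is the regex exactly as posted in transforms.conf above; `TIGHT` is a hypothetical tightened variant (my assumption, not part of the original answer) that includes `="` after `value` and stops each capture at the next quote.

```python
import re

# One sample line from the log file in the question.
LINE = '<property name="autosync" value="true" />'

# Full set of property lines, as they appear inside one event.
EVENT = """<property name="autosync" value="true" />
<property name="synctime" value="1440" />
<property name="lastsync" value="2019-12-09T11:16:12" />
<property name="nextsync" value="2019-12-10T11:16:28" />"""

# Regex as posted in transforms.conf: nothing consumes the =" after "value",
# so the second capture group swallows it together with the closing quote.
POSTED = re.compile(r'property\sname="(.+)"\svalue(.+)\s\/>')

# Hypothetical tightened variant: match =" explicitly and capture up to the next quote.
TIGHT = re.compile(r'property\sname="([^"]+)"\svalue="([^"]+)"\s\/>')

posted_groups = POSTED.search(LINE).groups()
tight_groups = TIGHT.search(LINE).groups()
print(posted_groups)
print(tight_groups)

# With a $1::$2 format, the first capture is the field name and the second its value.
# Collecting every match in the event mimics that pairing.
pairs = dict(TIGHT.findall(EVENT))
print(pairs)
```

With the posted regex the value capture still contains `="` and the trailing quote, so the captured value is not the clean attribute value; the tightened variant returns the bare name and value for each property line.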
Hi,
I still have an issue here, so I decided to delete the sourcetype, the field extraction and the monitored inputs.
Then I created the field extraction through the web interface and called it xml_extract, with this as the entry: `property\sname="(.+)"\svalue="(.+)"\s\/>
$1::$2`
Then I set up the monitored input and created the sourcetype with the settings above, but the fields are still not extracted.