Solved: How to configure XML data parsing?

dbcase · ‎02-06-2017

Hi,

I've not tried to parse XML data in Splunk so I need a bit of hand holding.... I have the following data that repeats for different sensors. I'd like to be able to extract all the XML fields. I know this needs to be done in props.conf (and maybe transforms.conf) but I don't know where to begin :).... Any help would be GREATLY appreciated! TY!!!

 <!--Crow Sensors Begin-->
    <DeviceDescriptor>
        <uuid>4434D720-A9E7-11E3-9CF2-0002A5D5C51B</uuid>
        <description>Flood Sensor</description>
        <category>zigbee</category>
        <manufacturer>Crow</manufacturer>
        <model>FLOOD-ZB</model>
        <hardwareVersions>0x1C</hardwareVersions>
        <firmwareVersions>0x01000025</firmwareVersions>
        <latestFirmware>
            <version>0x01000025</version>
            <filename>crow-flood-zb-v1.0.25.ota</filename>
            <type>ota</type>
        </latestFirmware>
    </DeviceDescriptor>

    <DeviceDescriptor>
        <uuid>4b931971-bf2a-11e3-b1b6-0800200c9a66</uuid>
        <description>Motion (PIR) Sensor</description>
        <category>zigbee</category>
        <manufacturer>Crow</manufacturer>
        <model>PIR-ZB</model>
        <hardwareVersions>0x1A</hardwareVersions>
        <firmwareVersions>0x01000025</firmwareVersions>
        <latestFirmware>
            <version>0x01000025</version>
            <filename>crow-pir-zb-v1.0.25.ota</filename>
            <type>ota</type>
        </latestFirmware>
    </DeviceDescriptor>

somesoni2 · ‎02-06-2017

Ok. So if the ingestion is done in such a way that each <DeviceDescriptor> is a separate event in Splunk with valid xml syntax, the field extraction is as simple as adding KV_MODE = xml in props.conf on Search Head(s). So, main focus should be getting the data ingested correctly. Since your data doesn't have timestamp, I'm using current time as the _time value for the event.

Try this for props.conf on your Indexer/Heavy Forwarder.

[YourSourceType]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\s*\<DeviceDescriptor\>)
DATETIME_CONFIG = CURRENT

Search head props.conf

[YourSourceType]
KV_MODE =xml

View solution in original post

somesoni2 · ‎02-06-2017

Ok. So if the ingestion is done in such a way that each <DeviceDescriptor> is a separate event in Splunk with valid xml syntax, the field extraction is as simple as adding KV_MODE = xml in props.conf on Search Head(s). So, main focus should be getting the data ingested correctly. Since your data doesn't have timestamp, I'm using current time as the _time value for the event.

Try this for props.conf on your Indexer/Heavy Forwarder.

[YourSourceType]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\s*\<DeviceDescriptor\>)
DATETIME_CONFIG = CURRENT

Search head props.conf

[YourSourceType]
KV_MODE =xml

bestpa · ‎06-21-2019

worked for me but had to modify the linebreaker myself. check your XML file for syntax problems as well.
xmlint --noout filename.xml ; echo $?
If there wasn't an error with your xml syntax, you should see result code 0 and nothing but a blank return.

HMTODD · ‎09-20-2018

I am have a similar problem. I have tried a number of other suggestion solutions. This is the first that states the requirement for indexer / heavy forwarder and search head configurations. I was placing these configurations on the Universal Forwarder where the XML files are being written. Why does the KV_MODE=xml needs to go on the search head? We are using Splunk Cloud so this is not an option for me.

somesoni2 · ‎02-06-2017

Is the data already indexed in Splunk? If yes, then does each of DeviceDescriptor entry is coming as separate event or single event?

dbcase · ‎02-06-2017

Hi Somesoni2!

No, the data has not been indexed as of yet. I can test out a couple if that would be helpful. Let me know, thanks!

How to configure XML data parsing?

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

From Data to Insight: Announcing the Winners of the Splunk Dashboard Contest

Splunk Developers: Construct Your Future at the .conf26 Builder Bar

Quick connection discovery mode for forwarders

Join the Conversation