Getting Data In

help with xml input

a212830
Champion

Hi,

I need some help with an XML input being sent via a TCP port.

Here's an example of the input:

<version>0003</version>
<step>5</step>
<lastupdate>1418232909</lastupdate>
<ds>
    <name>memory_total_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>268397836</last_ds>
    <value>1181295095.6967</value>
    <unknown_sec>0</unknown_sec>
</ds>
<ds>
    <name>memory_free_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>248575516</last_ds>
    <value>1094051436.2458</value>
    <unknown_sec>0</unknown_sec>
</ds>

I'd like to do the following:
1) Create events from the opening ds tag to the closing ds tag.
2) Use the timestamp that is provided for all events, until the next one is sent.
3) Throw out anything that isn't encapsulated in a tag.

Is this possible? The information comes in every minute and is actually quite large; the sample above is only a snippet.


esix_splunk
Splunk Employee

There will be an issue breaking up events at the <ds> tag, as there is no timestamp inside those events. Splunk really wants to see the timestamp within each event to assign it properly. Refer to: http://docs.splunk.com/Documentation/Splunk/6.2.1/Data/Configuretimestamprecognition#The_timestamp_p...

That being said, you can try something like the below, and it might work if the events are consecutive.

For your sourcetype you can try setting something like this:

props.conf:

[mysourcetype]
SHOULD_LINEMERGE = true
KV_MODE = xml
TRUNCATE = 999999
BREAK_ONLY_BEFORE = \<ds\>
TIME_PREFIX = \<lastupdate\>

You will need to modify that a bit, I think, but it should break events at each <ds> tag and then extract key/value pairs based on the XML.
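One tweak that may be needed (my assumption, not part of the original suggestion): since <lastupdate> holds an epoch timestamp, Splunk may also need an explicit TIME_FORMAT to parse it, and a lookahead bound keeps timestamp extraction cheap. A sketch:

```
[mysourcetype]
SHOULD_LINEMERGE = true
KV_MODE = xml
TRUNCATE = 999999
BREAK_ONLY_BEFORE = \<ds\>
TIME_PREFIX = \<lastupdate\>
# epoch seconds, e.g. 1418232909 (assumption: %s matches the sample data)
TIME_FORMAT = %s
MAX_TIMESTAMP_LOOKAHEAD = 10
```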

Share the results.

Thanks


a212830
Champion

Thanks. Unfortunately, it didn't work - it's creating one huge event.


a212830
Champion

Update: The data above was copied and formatted via XML tidy. The incoming data is one big stream:

<version>0003</version><step>5</step><lastupdate>1418232909</lastupdate><ds><name>memory_total_kib</name><type>GAUGE</type><minimal_heartbeat>300.0000</minimal_heartbeat><min>0.0</min><max>Infinity</max><last_ds>268397836</last_ds><value>1181295095.6967</value><unknown_sec>0</unknown_sec></ds><ds><name>memory_free_kib</name><type>GAUGE</type><minimal_heartbeat>300.0000</minimal_heartbeat><min>0.0</min><max>Infinity</max><last_ds>248575516</last_ds><value>1094051436.2458</value><unknown_sec>0</unknown_sec></ds>

I found that if I tested against the XML-tidied file, it worked (pretty much, anyway), but the non-formatted version is still seen as one big event.

0 Karma

esix_splunk
Splunk Employee

Splunk looks at the raw data as it's coming in, and in your case it's not arriving as formatted XML, which will make this difficult. You'll most likely need a lot of regex magic, or to re-evaluate your method for getting this into Splunk.
If you can dump the XML-formatted data to a log file on a system and then read it in with a Splunk file input, that would be the best way to approach this. Then you also aren't subject to loss of in-flight data if Splunk is restarted or down.
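One way to apply that suggestion is a small shim that reads the raw TCP stream (or a capture of it), remembers the most recent <lastupdate> epoch timestamp, and writes one log line per <ds> block with that timestamp prepended, so every event Splunk reads carries its own timestamp. A minimal sketch, assuming the stream looks like the one-line sample above; split_events and the regex are my own illustration, not a Splunk feature:

```python
import re

def split_events(raw):
    """Return one log line per <ds> block, prefixed with the last seen timestamp."""
    events = []
    ts = "unknown"
    # Scan the stream left to right; a <lastupdate> match updates the current
    # timestamp, and each <ds>...</ds> match becomes one event carrying it.
    for m in re.finditer(r"<lastupdate>(\d+)</lastupdate>|<ds>(.*?)</ds>", raw, re.S):
        if m.group(1):
            ts = m.group(1)  # carry this timestamp until the next <lastupdate>
        else:
            events.append("<lastupdate>%s</lastupdate><ds>%s</ds>" % (ts, m.group(2)))
    return events

if __name__ == "__main__":
    # Shortened stand-in for the raw one-line stream shown above.
    stream = (
        "<version>0003</version><step>5</step>"
        "<lastupdate>1418232909</lastupdate>"
        "<ds><name>memory_total_kib</name><type>GAUGE</type>"
        "<last_ds>268397836</last_ds><value>1181295095.6967</value></ds>"
        "<ds><name>memory_free_kib</name><type>GAUGE</type>"
        "<last_ds>248575516</last_ds><value>1094051436.2458</value></ds>"
    )
    for line in split_events(stream):
        print(line)
```

With the output written to a log file, the props.conf above should then break cleanly on each line and find the timestamp at the start of every event.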
