Getting Data In

Help with XML input

a212830
Champion

Hi,

I need some help with an XML input that is being sent over a TCP port.

Here's an example of the input:

<version>0003</version>
<step>5</step>
<lastupdate>1418232909</lastupdate>
<ds>
    <name>memory_total_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>268397836</last_ds>
    <value>1181295095.6967</value>
    <unknown_sec>0</unknown_sec>
</ds>
<ds>
    <name>memory_free_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>248575516</last_ds>
    <value>1094051436.2458</value>
    <unknown_sec>0</unknown_sec>
</ds>

I'd like to do the following:

1) Create one event from each opening ds tag to its closing ds tag.
2) Apply the provided timestamp to all events until the next timestamp is sent.
3) Throw out anything that isn't enclosed in a tag.

Is this possible? The information comes in every minute and is actually quite large; the sample above is only a snippet.


esix_splunk
Splunk Employee

Breaking events at the <ds> tag will be a problem, because those events don't contain a timestamp of their own. Splunk expects to find a timestamp within each event in order to assign it properly. Refer to: http://docs.splunk.com/Documentation/Splunk/6.2.1/Data/Configuretimestamprecognition#The_timestamp_p...

That being said, you can try something like the settings below, and it might work if the events are consecutive.

For your sourcetype, try something like this in props.conf:

[mysourcetype] 
  SHOULD_LINEMERGE = TRUE
  KV_MODE = xml
  TRUNCATE = 999999
  BREAK_ONLY_BEFORE = \<ds\>
  TIME_PREFIX = \<lastupdate\>

I think you will need to modify that a bit, but it should break events at each <ds> tag and then extract key-value pairs from the XML.

Share the results.

Thanks


a212830
Champion

Thanks. Unfortunately, it didn't work - it's creating one huge event.


a212830
Champion

Update: The data shown above was copied and formatted with XML tidy. The incoming data is actually one big stream with no line breaks:

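<version>0003</version><step>5</step><lastupdate>1418232909</lastupdate><ds><name>memory_total_kib</name><type>GAUGE</type><minimal_heartbeat>300.0000</minimal_heartbeat><min>0.0</min><max>Infinity</max><last_ds>268397836</last_ds><value>1181295095.6967</value><unknown_sec>0</unknown_sec></ds><ds><name>memory_free_kib</name><type>GAUGE</type><minimal_heartbeat>300.0000</minimal_heartbeat><min>0.0</min><max>Infinity</max><last_ds>248575516</last_ds><value>1094051436.2458</value><unknown_sec>0</unknown_sec></ds>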

When I tested against the XML tidy file, it worked (pretty much, anyway), but the unformatted version is still seen as one big event.
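The difference between the two inputs can be pictured with a rough Python model of line merging. This is only an illustration under simplifying assumptions, not Splunk's actual implementation: it assumes the stream is first split on newlines and that a new event starts whenever a line matches the BREAK_ONLY_BEFORE pattern. The function name `merge_lines` is hypothetical.

```python
import re


def merge_lines(raw, break_only_before):
    """Rough model: split on newlines, then start a new event at each matching line."""
    pattern = re.compile(break_only_before)
    events = []
    for line in raw.splitlines():
        if pattern.search(line) or not events:
            events.append(line)          # this line opens a new event
        else:
            events[-1] += "\n" + line    # this line is merged into the current event
    return events


# Tidy (multi-line) version of the sample, shortened for illustration.
tidy = (
    "<lastupdate>1418232909</lastupdate>\n"
    "<ds>\n    <name>memory_total_kib</name>\n</ds>\n"
    "<ds>\n    <name>memory_free_kib</name>\n</ds>"
)
# Same content as a single line, the way it arrives on the TCP port.
stream = re.sub(r"\n\s*", "", tidy)

print(len(merge_lines(tidy, r"<ds>")))    # several events
print(len(merge_lines(stream, r"<ds>")))  # one big event
```

With no newlines in the stream there is only one line to work with, so a line-based break rule never gets a chance to fire.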


esix_splunk
Splunk Employee

Splunk looks at the raw data as it comes in. In your case it isn't arriving as formatted XML, which makes this difficult. You'll most likely need a lot of regex magic, or you'll need to re-evaluate how you get this data into Splunk.
If you can dump the formatted XML to a log file on a system and then read it in with a Splunk file input, that would be the best way to approach this. Then you also aren't subject to losing in-flight data if Splunk is restarted or down.
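The reformatting step suggested above could look something like the following minimal Python sketch. It is an assumption-laden illustration, not a tested solution: the function name `split_stream` and the key=value output layout are hypothetical, and the tag names are taken from the sample in the question. It emits one line per ds block, carries the most recent lastupdate value forward as the timestamp, and ignores anything outside the tags.

```python
import re

# Matches either a timestamp element or a whole <ds>...</ds> block.
TOKEN_RE = re.compile(r"<lastupdate>(\d+)</lastupdate>|<ds>(.*?)</ds>", re.DOTALL)
# Matches a simple <tag>value</tag> pair inside a ds block.
FIELD_RE = re.compile(r"<(\w+)>([^<]*)</\1>")


def split_stream(raw):
    """Return one 'timestamp key=value ...' line per <ds> block."""
    events = []
    timestamp = ""
    for match in TOKEN_RE.finditer(raw):
        if match.group(1) is not None:
            # New timestamp: applies to every ds block until the next one.
            timestamp = match.group(1)
        else:
            fields = " ".join(
                f"{key}={value.strip()}"
                for key, value in FIELD_RE.findall(match.group(2))
            )
            events.append(f"{timestamp} {fields}")
    return events


sample = (
    "<version>0003</version><step>5</step><lastupdate>1418232909</lastupdate>"
    "<ds><name>memory_total_kib</name><type>GAUGE</type><last_ds>268397836</last_ds></ds>"
    "stray text outside any tag"
    "<ds><name>memory_free_kib</name><type>GAUGE</type><last_ds>248575516</last_ds></ds>"
)

for line in split_stream(sample):
    print(line)
```

Appending these lines to a log file and monitoring that file would give Splunk one line per event with the epoch time at the start of each line. The sourcetype's timestamp settings would still need to be adjusted to match that layout.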
