Getting Data In

help with xml input

a212830
Champion

Hi,

I need some help with an xml input being sent via a tcp port.

Here's an example of the input:

<version>0003</version>
<step>5</step>
<lastupdate>1418232909</lastupdate>
<ds>
    <name>memory_total_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>268397836</last_ds>
    <value>1181295095.6967</value>
    <unknown_sec>0</unknown_sec>
</ds>
<ds>
    <name>memory_free_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>248575516</last_ds>
    <value>1094051436.2458</value>
    <unknown_sec>0</unknown_sec>
</ds>

I'd like to do the following:
1) Create one event from each opening ds tag to its closing ds tag.
2) Use the timestamp that is provided for all events, until the next one is sent.
3) Throw out anything that isn't enclosed in a tag.

Is this possible? The information comes in every minute and is actually quite large. The sample above is only a snippet.
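
For illustration only, the three requirements can be sketched outside Splunk as a small Python routine. This is a rough sketch, not a Splunk configuration: the function name `split_events` and the `lastupdate` dictionary key are made up for this example, and it assumes the stream still contains the XML tags shown in the sample.

```python
import re

# Matches either a <lastupdate> value or the inside of one <ds>...</ds> block.
TOKEN_RE = re.compile(
    r"<lastupdate>\s*(\d+)\s*</lastupdate>|<ds>(.*?)</ds>", re.DOTALL
)
# Matches a simple <tag>value</tag> pair inside a <ds> block.
FIELD_RE = re.compile(r"<(\w+)>([^<]*)</\1>")


def split_events(raw, last_ts=None):
    """Return (events, last_ts).

    One dict per <ds> block, each stamped with the most recent
    <lastupdate> value seen so far. Text outside those tags is ignored.
    """
    events = []
    for match in TOKEN_RE.finditer(raw):
        if match.group(1) is not None:
            # A new timestamp applies to every <ds> block that follows it.
            last_ts = int(match.group(1))
        else:
            fields = {k: v.strip() for k, v in FIELD_RE.findall(match.group(2))}
            fields["lastupdate"] = last_ts
            events.append(fields)
    return events, last_ts
```

Passing the returned `last_ts` back in on the next call carries the timestamp forward until a new `<lastupdate>` arrives.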

esix_splunk
Splunk Employee

Breaking events at the <ds> tag will be a problem, because those events have no timestamps of their own. Splunk really wants to see a timestamp within each event so it can assign the time properly. Refer to: http://docs.splunk.com/Documentation/Splunk/6.2.1/Data/Configuretimestamprecognition#The_timestamp_p...

That being said, you can try something like the below; it might work if the events are consecutive.

For your sourcetype you can try setting something like this:

props.conf:

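[mysourcetype]
# Merge lines into multi-line events
SHOULD_LINEMERGE = true
# Extract key/value pairs from the XML at search time
KV_MODE = xml
# Raise the maximum event size (in bytes)
TRUNCATE = 999999
# Start a new event before each <ds> tag
BREAK_ONLY_BEFORE = \<ds\>
# Look for the timestamp right after <lastupdate>
TIME_PREFIX = \<lastupdate\>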

You will probably need to modify that a bit, but it should break events at each <ds> tag and then extract key-value pairs from the XML.

Share the results.

Thanks

a212830
Champion

Thanks. Unfortunately, it didn't work - it's creating one huge event.

a212830
Champion

Update: The data format above was copied and formatted via XML tidy. The incoming data is one big stream:

000351418232909memory_total_kibGAUGE300.00000.0Infinity2683978361181295095.69670memory_free_kibGAUGE300.00000.0Infinity2485755161094051436.24580

I found that when I tested against the XML tidy file, it worked (pretty much, anyway), but the unformatted version is still seen as one big event.

esix_splunk
Splunk Employee

Splunk looks at the raw data as it comes in, and in your case it isn't coming in formatted as XML, which makes this difficult. You'll most likely need a lot of regex magic, or you'll need to re-evaluate how you get this data into Splunk.
If you can dump the XML-formatted data to a log file on a system and then read it in with a Splunk file input, that would be the best way to approach this. Then you also aren't subject to loss of in-flight data if Splunk is restarted or down.
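
As a rough sketch of that "dump it to a log file first" idea, something like the following Python could sit between the sender and Splunk. The names `reflow` and `append_to_log` and the log path are hypothetical, and the sketch assumes the raw stream still carries the XML tags (just without line breaks). If the tags really are missing from the stream, this will not help.

```python
import re


def reflow(raw):
    """Put <lastupdate> and every <ds> block on its own line."""
    # Drop whitespace between tags so each block ends up on a single line.
    compact = re.sub(r">\s+<", "><", raw.strip())
    # Insert a newline before each tag of interest, except at the very start.
    return re.sub(r"(?<!^)(?=<lastupdate>|<ds>)", "\n", compact) + "\n"


def append_to_log(raw, log_path):
    """Append the reflowed stream to a file that a file input could monitor."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(reflow(raw))
```

The resulting file has one `<ds>` block per line, so line-based event breaking has something to work with. The timestamp still sits on its own line here; stamping each `<ds>` line with it would be a further step.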
