Getting Data In

help with xml input

a212830
Champion

Hi,

I need some help with an XML input being sent via a TCP port.

Here's an example of the input:

<version>0003</version>
<step>5</step>
<lastupdate>1418232909</lastupdate>
<ds>
    <name>memory_total_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>268397836</last_ds>
    <value>1181295095.6967</value>
    <unknown_sec>0</unknown_sec>
</ds>
<ds>
    <name>memory_free_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>248575516</last_ds>
    <value>1094051436.2458</value>
    <unknown_sec>0</unknown_sec>
</ds>

I'd like to do the following:
1) Create events from the opening ds tag to the closing ds tag.
2) Use the timestamp that is provided for all events, until the next one is sent.
3) Throw out anything that isn't encapsulated in a tag.

Is this possible? The information is coming in every minute, and is actually quite large - the sample above is only a snippet.


esix_splunk
Splunk Employee

There will be an issue breaking up events at the <ds> element, as there are no timestamps within those events. Splunk really wants to see the timestamp within each event to assign the timestamp properly. Refer to: http://docs.splunk.com/Documentation/Splunk/6.2.1/Data/Configuretimestamprecognition#The_timestamp_p...

That being said, you can try something like the below, and it might work if the events are consecutive.

For your sourcetype you can try setting something like this:

props.conf:

[mysourcetype] 
  SHOULD_LINEMERGE = TRUE
  KV_MODE = xml
  TRUNCATE = 999999
  BREAK_ONLY_BEFORE = \<ds\>
  TIME_PREFIX = \<lastupdate\>

You will need to modify that a bit, I think, but it should break events at each <ds> and then extract key-value pairs based on the XML.
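One possible addition, since <lastupdate> holds a Unix epoch (e.g. 1418232909): telling Splunk the timestamp format explicitly may help it parse the value after TIME_PREFIX. This is an untested addition to the stanza above:

```
  TIME_FORMAT = %s
```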

Share the results.

Thanks


a212830
Champion

Thanks. Unfortunately, it didn't work - it's creating one huge event.


a212830
Champion

Update: the data format above was copied and then formatted via XML tidy. The incoming data is one big stream:

000351418232909memory_total_kibGAUGE300.00000.0Infinity2683978361181295095.69670memory_free_kibGAUGE300.00000.0Infinity2485755161094051436.24580

I found that if I tested against the XML-tidied file, it worked (pretty much, anyway), but the non-formatted version is still seen as one big event.


esix_splunk
Splunk Employee

Splunk looks at the raw data as it's coming in, and in your case it's not arriving as formatted XML, which will make this difficult. You'll most likely need a lot of regex magic, or you'll need to re-evaluate the method for getting this into Splunk.
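If the tags do survive in the raw TCP stream (the forum display may have stripped the angle brackets from your sample), one untested option along the regex route is to skip line merging and break the stream with LINE_BREAKER instead, since it operates on the raw stream rather than on lines. A sketch, assuming the same sourcetype name:

```
[mysourcetype]
  SHOULD_LINEMERGE = false
  LINE_BREAKER = ()\<ds\>
  TIME_PREFIX = \<lastupdate\>
  TIME_FORMAT = %s
  TRUNCATE = 999999
  KV_MODE = xml
```

The empty capture group makes the break zero-width, so each <ds> stays at the start of its event. Only the first event will contain <lastupdate>; subsequent events should fall back to the previous event's timestamp, which matches what the poster wants.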
If you can dump the XML-formatted data to a log file on a system, and then read it in with a Splunk file input, that would be the best way to approach this. Then you also aren't subject to loss of in-flight data if Splunk is restarted or down.
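As a sketch of that pre-processing route (hypothetical code, not tested against the real feed): a small Python helper that carries the most recent <lastupdate> epoch onto each <ds>...</ds> element and drops everything outside those tags, so each output line is a self-contained event you could append to a log file for a Splunk file input:

```python
import re

def split_ds_events(stream):
    """Split a raw XML stream into one line per <ds>...</ds> element,
    prefixing each with the most recent <lastupdate> epoch so Splunk
    can timestamp every event. Anything outside those tags is dropped."""
    events = []
    timestamp = None
    # Walk the stream in order, tracking the latest <lastupdate> value.
    pattern = r"<lastupdate>(\d+)</lastupdate>|<ds>(.*?)</ds>"
    for m in re.finditer(pattern, stream, re.DOTALL):
        if m.group(1) is not None:
            timestamp = m.group(1)
        elif timestamp is not None:
            events.append("<lastupdate>%s</lastupdate><ds>%s</ds>"
                          % (timestamp, m.group(2).strip()))
    return events
```

Each returned line then has the timestamp inside the event, which is what Splunk wants for timestamp recognition.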
