Splunk Search

transform XML with same node name and add field names

jbanAtSplunk
Communicator

Hi,

I have Windows events for a specific application that carries its payload in the Windows Event Log. When I use Splunk_TA_windows to extract the data, I get a field with multiple "Data" elements.

<Data>process_name</Data><Data>signature_name</Data><Data>binary_description</Data>

How can I extract these automatically into field/value pairs:
process_name = process_name
signature = signature_name
binary = binary_description

 

Is there any way to do this without one "big" regex? I would like to just capture $1, $2, $3, ... and then assign names to $1, $2, $3, as is done for CSV.

Something like:

REGEX = (?ms)<Data>(.*?)<\/Data>


This would perhaps create one multivalue field, to which field names could then be assigned.


glc_slash_it
Path Finder

Hi,

What sourcetype does Splunk apply? Also, can you paste a complete event?

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

 

To start, you could try this in SPL:

| rex "<Data>(?<process_name>.*)<\/Data><Data>(?<signature_name>.*)<\/Data><Data>(?<binary_description>.*)<\/Data>"
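The suggestion above can be tried without real data. This is a minimal sketch, assuming a mock event built with makeresults (the sample values are made up). It swaps the greedy `.*` for `[^<]*`, because a greedy capture can run across several <Data> elements once the event contains more than three of them, while `[^<]*` stops at the next tag:

```spl
| makeresults
| eval _raw="<Data>proc.exe</Data><Data>sig_a</Data><Data>some description</Data>"
| rex "<Data>(?<process_name>[^<]*)</Data><Data>(?<signature_name>[^<]*)</Data><Data>(?<binary_description>[^<]*)</Data>"
| table process_name signature_name binary_description
```

The pattern is still positional, so it only names the first three <Data> elements it finds.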

 


jbanAtSplunk
Communicator

Hey, that SPL is good, but the event has 99 Data sections and I am getting regex backtracking errors on Regex101.

Currently I do it like this:

[test_xmldata_to_fields]
SOURCE_KEY = EventData_Xml
REGEX = (?ms)<Data>(.*?)<\/Data>
FORMAT = test_data::$1
MV_ADD = 1

And then (a dirty one, but it works for a start):
EVAL-t_process_name=mvindex(test_data,0)
EVAL-t_signature_name=mvindex(test_data,1)
EVAL-t_binary_description=mvindex(test_data,2)
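
The same positional logic can be checked at search time before it goes into configuration. This is a rough sketch, assuming a mock EventData_Xml value (made up here). It uses rex with max_match=0 to collect every <Data> value into one multivalue field, which mirrors what MV_ADD does in the transform:

```spl
| makeresults
| eval EventData_Xml="<Data>proc.exe</Data><Data>sig_a</Data><Data>some description</Data>"
| rex field=EventData_Xml max_match=0 "(?ms)<Data>(?<test_data>.*?)</Data>"
| eval t_process_name=mvindex(test_data,0), t_signature_name=mvindex(test_data,1), t_binary_description=mvindex(test_data,2)
| table t_process_name t_signature_name t_binary_description
```

Indexing by position assumes every <Data> element is present and non-empty in every event. How empty elements affect the positions is something to verify against real events.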
 

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

* Yes

 

As for the sourcetype, I created my own and only use Splunk_TA_windows for the initial report that extracts Data_Xml. Basically, it's a new sourcetype, so I can set up transforms and props as I like.

 


yuanliu
SplunkTrust

My usual advice is: do not treat structured data such as XML as string text. Splunk's built-in routines designed to process XML (e.g., spath) are much more robust than any regex you can construct.

If you have difficulty using spath and the like, post sample or mock data (anonymized as needed) and explain what search you use, what result you get, and how that result differs from what you want.
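
As a starting point for the spath route, here is a minimal sketch. It assumes the XML sits in a field named EventData_Xml (as in the transform earlier in the thread), that the sample values below stand in for real data, and that spath accepts a fragment without a single root element, which should be confirmed against real events:

```spl
| makeresults
| eval EventData_Xml="<Data>proc.exe</Data><Data>sig_a</Data><Data>some description</Data>"
| spath input=EventData_Xml path=Data output=test_data
| eval process_name=mvindex(test_data,0), signature_name=mvindex(test_data,1), binary_description=mvindex(test_data,2)
| table process_name signature_name binary_description
```

Because the <Data> elements carry no distinguishing attribute, the names still have to be assigned by position with mvindex. spath only replaces the regex that pulls the values out.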
