Splunk Search

transform XML with same node name and add field names

jbanAtSplunk
Communicator

Hi,

I have Windows events for a specific application that carries its payload in the Windows Event Log. When I use Splunk_TA_windows to extract the data, I get a field containing multiple "Data" elements:

<Data>process_name</Data><Data>signature_name</Data><Data>binary_description</Data>

How can I automatically extract these into field/value pairs:
process_name = process_name
signature = signature_name
binary = binary_description

 

Is there any way to do this without a "big" regex? I would like to capture the values positionally as $1, $2, $3, ... and then assign names to $1, $2, $3, as is done for CSV.

something like: 

REGEX = (?ms)<Data>(.*?)<\/Data>


This would perhaps create one multivalue field, and then I could assign field names to its values.


glc_slash_it
Path Finder

Hi,

What sourcetype does Splunk apply? Also, can you paste a complete event?

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

 

To start, you could try this in SPL:

| rex "<Data>(?<process_name>.*)<\/Data><Data>(?<signature_name>.*)<\/Data><Data>(?<binary_description>.*)<\/Data>"

 


jbanAtSplunk
Communicator

Hey, that SPL is good, but the event has 99 Data sections, and I'm getting regex backtracking errors on Regex101.

Currently I do it like this:

[test_xmldata_to_fields]
SOURCE_KEY = EventData_Xml
REGEX = (?ms)<Data>(.*?)<\/Data>
FORMAT = test_data::$1
MV_ADD = 1

And then (a dirty approach, but it works for a start):
EVAL-t_process_name=mvindex(test_data,0)
EVAL-t_signature_name=mvindex(test_data,1)
EVAL-t_binary_description=mvindex(test_data,2)
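
For testing at search time before touching props/transforms, roughly the same idea can be sketched in SPL. This is only a sketch: it assumes the XML fragment is already available in the EventData_Xml field, and it reuses the test_data and t_* names from above. max_match=0 asks rex to keep every match, so test_data becomes a multivalue field:

```spl
| rex field=EventData_Xml max_match=0 "<Data>(?<test_data>[^<]*)</Data>"
| eval t_process_name=mvindex(test_data,0),
       t_signature_name=mvindex(test_data,1),
       t_binary_description=mvindex(test_data,2)
```

Using [^<]* instead of .*? keeps each match inside a single element. If some of the Data elements can be empty, it is worth checking that the positions do not shift.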
 

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

* Yes

 

As for the sourcetype, I created my own and only use Splunk_TA_windows for the initial report that extracts Data_Xml. Basically, it's a new sourcetype, so I can set up transforms and props as I like.

 


yuanliu
SplunkTrust

My usual advice is not to treat structured data such as XML as plain text. Splunk's built-in routines designed to process XML (e.g., spath) are much more robust than any regex you can construct.

If you have difficulty using spath and the like, post sample or mock data (anonymized as needed) and explain what search you use, what result you get, and how that result differs from what you want.
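
As a starting point, here is a minimal sketch of the spath approach. It assumes the XML fragment sits in the EventData_Xml field mentioned earlier in the thread and that spath can parse that fragment as-is; it has not been verified against the real events. The output field name test_data and the positional mapping are taken from the posts above:

```spl
| spath input=EventData_Xml path=Data output=test_data
| eval process_name=mvindex(test_data,0),
       signature_name=mvindex(test_data,1),
       binary_description=mvindex(test_data,2)
```

If the Data elements carry attributes, the result may look different from this, which is one more reason to post a sample event.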
