Splunk Search

transform XML with same node name and add field names

jbanAtSplunk
Communicator

Hi,

I have Windows Event for specific application that have payload in Windows Event Log, when using Splunk_TA_windows to extract data will get field with multipe "Data".

<Data>process_name</Data><Data>signature_name</Data><Data>binary_description</Data>

How can I extract it automatically to fields/value:
process_name = process_name
signature = signature_name
binary = binary_description

 

Is there any way without using "big" regex? to just $1:$2:$3.. and then add names to $1, $2, $3 like for CSV.

something like: 

REGEX = (?ms)<Data>(.*?)<\/Data>


this will create maybe one multi value field and then assign Field_name

Labels (1)
0 Karma

glc_slash_it
Path Finder

Hi,

What is the sourcetype applied by splunk? Also can you paste an complete event?

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

 

Maybe to start you could try this on spl:

| rex "<Data>(?<process_name>.*)<\/Data><Data>(?<signature_name>.*)<\/Data><Data>(?<binary_description>.*)<\/Data>"

 

0 Karma

jbanAtSplunk
Communicator

Hey, that SPL is good. But it have 99 Data section and getting Regex backlag errors on Regex101. 

Currently I make it like

[test_xmldata_to_fields]
SOURCE_KEY = EventData_Xml
REGEX = (?ms)<Data>(.*?)<\/Data>
FORMAT = test_data::$1
MV_ADD = 1

And then (dirty one, but it's working for start)
EVAL-t_process_name=mvindex(test_data,0)
EVAL-t_signature_name=mvindex(test_data,1)
EVAL-t_binary_description=mvindex(test_data,2)
 

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

* Yes

 

Sourcetype, I create my own and just using Splunk_TA_Windows for initial report to extract Data_Xml. Basically, it's new Sourcetype and can do transform, props as I like. 

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Do not treat structured data such as XML as string text is my usual advice.  Splunk's built-in routines designed to process XML (e.g., spath) is much more robust than any regex you can construct.

If you have difficulty with using spath and such, post sample/mock data (anonymize as needed) and explain what search you use and what result you get, how the result is different from your desires.

0 Karma
Get Updates on the Splunk Community!

Prove Your Splunk Prowess at .conf25—No Prereqs Required!

Your Next Big Security Credential: No Prerequisites Needed We know you’ve got the skills, and now, earning the ...

Splunk Observability Cloud's AI Assistant in Action Series: Observability as Code

This is the sixth post in the Splunk Observability Cloud’s AI Assistant in Action series that digs into how to ...

Splunk Answers Content Calendar, July Edition I

Hello Community! Welcome to another month of Community Content Calendar series! For the month of July, we will ...