Splunk Search

Transform XML with the same node name and add field names

jbanAtSplunk
Communicator

Hi,

I have Windows events for a specific application that carries its payload in the Windows Event Log. When I use Splunk_TA_windows to extract the data, I get a field with multiple "Data" elements.

<Data>process_name</Data><Data>signature_name</Data><Data>binary_description</Data>

How can I extract these automatically into field/value pairs:
process_name = process_name
signature = signature_name
binary = binary_description

 

Is there any way to do this without a "big" regex? Something that just captures $1:$2:$3... and then assigns names to $1, $2, $3, like for CSV.

Something like:

REGEX = (?ms)<Data>(.*?)<\/Data>


This would probably create one multivalue field, and then I could assign the field names.


glc_slash_it
Path Finder

Hi,

What sourcetype does Splunk apply? Also, can you paste a complete event?

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

 

Maybe to start you could try this in SPL:

| rex "<Data>(?<process_name>[^<]*)<\/Data><Data>(?<signature_name>[^<]*)<\/Data><Data>(?<binary_description>[^<]*)<\/Data>"

 


jbanAtSplunk
Communicator

Hey, that SPL is good, but the event has 99 Data sections and I'm getting regex backtracking errors on Regex101.

Currently I do it like this:

[test_xmldata_to_fields]
SOURCE_KEY = EventData_Xml
REGEX = (?ms)<Data>(.*?)<\/Data>
FORMAT = test_data::$1
MV_ADD = 1

And then (a dirty one, but it works for a start):
EVAL-t_process_name=mvindex(test_data,0)
EVAL-t_signature_name=mvindex(test_data,1)
EVAL-t_binary_description=mvindex(test_data,2)
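
For reference, the transform and the calculated fields above might be wired together roughly as follows. This is a minimal sketch, not a tested configuration: the props.conf stanza name `[my_custom_sourcetype]` and the REPORT class name `xmldata` are hypothetical placeholders, and it assumes EventData_Xml has already been extracted by an earlier search-time extraction.

```conf
# transforms.conf
# Stanza as posted above: each <Data>...</Data> match is appended
# to the multivalue field test_data.
[test_xmldata_to_fields]
SOURCE_KEY = EventData_Xml
REGEX = (?ms)<Data>(.*?)<\/Data>
FORMAT = test_data::$1
MV_ADD = 1

# props.conf
# ASSUMPTION: [my_custom_sourcetype] is a placeholder for the custom sourcetype.
[my_custom_sourcetype]
# ASSUMPTION: the class name "xmldata" is arbitrary.
REPORT-xmldata = test_xmldata_to_fields
# Calculated fields run after REPORT extractions at search time,
# so test_data is available here. mvindex is zero-based.
EVAL-t_process_name = mvindex(test_data,0)
EVAL-t_signature_name = mvindex(test_data,1)
EVAL-t_binary_description = mvindex(test_data,2)
```

Because the names are assigned purely by position, it is worth checking how empty Data elements are handled: a value that is skipped would shift every later index.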
 

Regarding the <Data> field, does it always have the same format (process_name, signature_name,binary_description)?

* Yes

 

Sourcetype: I created my own and am only using Splunk_TA_windows for the initial report to extract Data_Xml. Basically, it's a new sourcetype, and I can set up transforms and props as I like.

 


yuanliu
SplunkTrust

My usual advice is not to treat structured data such as XML as plain text. Splunk's built-in routines for processing XML (e.g., spath) are much more robust than any regex you can construct.

If you have difficulty using spath and the like, post sample/mock data (anonymized as needed) and explain what search you use, what result you get, and how that result differs from what you want.
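
As a starting point, a run-anywhere search along these lines shows the spath approach on mock data. It is only a sketch: the sample values are made up, the `<EventData>` wrapper is an assumption about the shape of the XML, and the field names EventData_Xml and test_data are borrowed from the earlier post. The path may need adjusting to match the real event.

```spl
| makeresults
| eval EventData_Xml="<EventData><Data>example_process.exe</Data><Data>Example.Signature</Data><Data>Example binary description</Data></EventData>"
| spath input=EventData_Xml path=EventData.Data output=test_data
| eval process_name=mvindex(test_data,0), signature_name=mvindex(test_data,1), binary_description=mvindex(test_data,2)
| table process_name signature_name binary_description
```

Here spath collects the repeated Data elements into one multivalue field, and mvindex (zero-based) assigns names by position, so the approach relies on the elements always appearing in the same order.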
