Splunk Search

How to translate mv field extraction from SPL to configuration files?

jmartens
Path Finder

Situation
I am trying to parse events that contain an arbitrary number of key-value pairs, some of which may have empty values. I would like to use the text between the closing parenthesis and the opening square bracket as the field name, with the spaces removed (not replaced by underscores).

This is an example of such data:

 

2021-02-24 10:02:31 Local0 Info 10:02:31:346 VARC-DCM-01.ad.maastro.nl MAASTRO\VARC-DCM-01$|80012|DICOM Service VARC_DCM_SCP_SVC_Export 021556/Export requested for object with key: (0008,0008) Image Type [DERIVED] | (0008,0016) SOP Class UID [1.2.840.10008.5.1.4.1.1.481.1] | (0008,0022) Acquisition Date [20210223] | (0008,0023) Content Date [20210223] | (0008,0032) Acquisition Time [184740.207] | (0008,0033) Content Time [184740.208] | (0008,1150) Referenced SOP Class UID [1.2.840.10008.5.1.4.1.1.481.5] | (0020,0013) Instance Number [1] | (300C,0002) Referenced RT Plan Sequence [Mergecom.MCitem] | (300C,0006) Referenced Beam Number [1] | (300E,0002) Approval Status []

 

Working solution using SPL

Using this SPL expression (inspired by the example in this question on multiple field extraction):

 

| eval backup=_raw
| rex max_match=0 mode=sed "s/(?:(?:\s\|)?\s)\((?<g>[\da-fA-F]{4}),(?<e>[\da-fA-F]{4})\)\s+(?<k>(?:\w+(?:\s*))+)\[(?<v>[^\]]*)\]/\3=\"\4\",/g" 
| rex mode=sed "s/\s//g" 
| extract pairdelim=":," kvdelim="="
| rename backup AS _raw

 

I am able to translate this to my desired outcome:

Image Type             | Derived
SOP Class UID          | 1.2.840.10008.5.1.4.1.1.481.1
...                    | ...
Referenced Beam Number | 1
Approval Status        |

 

Example in SPL (for testing)

Here is a working example to help with testing:

 

| makeresults
| eval _raw="2021-02-24 10:02:31 Local0 Info 10:02:31:346 VARC-DCM-01.ad.maastro.nl MAASTRO\VARC-DCM-01$|80012|DICOM Service VARC_DCM_SCP_SVC_Export 021556/Export requested for object with key: (0008,0008) Image Type [DERIVED] | (0008,0016) SOP Class UID [1.2.840.10008.5.1.4.1.1.481.1] | (0008,0022) Acquisition Date [20210223] | (0008,0023) Content Date [20210223] | (0008,0032) Acquisition Time [184740.207] | (0008,0033) Content Time [184740.208] | (0008,1150) Referenced SOP Class UID [1.2.840.10008.5.1.4.1.1.481.5] | (0020,0013) Instance Number [1] | (300C,0002) Referenced RT Plan Sequence [Mergecom.MCitem] | (300C,0006) Referenced Beam Number [1] | (300E,0002) Approval Status []"
| eval backup=_raw
| rex max_match=0 mode=sed "s/(?:(?:\s\|)?\s)\((?<g>[\da-fA-F]{4}),(?<e>[\da-fA-F]{4})\)\s+(?<k>(?:\w+(?:\s*))+)\[(?<v>[^\]]*)\]/\3=\"\4\",/g" 
| rex mode=sed "s/\s//g" 
| extract pairdelim=":," kvdelim="="
| rename backup AS _raw

 

Question

Now I would like to transfer this to configuration files but I am unsure what to add where. 


Based on this post, I am guessing the regular expression goes into tokenizer.conf, but I am not sure how that works when it is combined with the sed command.

Normally I would put sed commands into the transforms.conf file, but how do I prevent them from applying to all events? Events like the one in the example are only a subset of the events in the index and of the sourcetypes in there.
The pairdelim and kvdelim values override the defaults from the sourcetype configuration; I am not sure where to put these either.

Can someone guide me here? Is there a way to configure a sequence like the one in SPL so that it applies only to specific events? How would I go about filtering out these events?


kmorris_splunk
Splunk Employee

Take a look at my response on this post.  

https://community.splunk.com/t5/Splunk-Search/To-Rex-or-not-to-rex/m-p/459516#M129688

I used this on a source where the key-value pairs were not consistent. This should allow you to dynamically extract the Key Values from whatever gets
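
For readers who cannot follow the link, here is a minimal sketch of one common way to do dynamic key-value extraction at search time with a transform. It is not necessarily what the linked post describes. The stanza name dicom_kv and the sourcetype name your_dicom_sourcetype are hypothetical placeholders, and the regular expression is a simplified variant of the one in the question that has only been checked by eye against the sample event.

```ini
# transforms.conf
# Hypothetical stanza name. Capture group 1 becomes the field name,
# capture group 2 becomes the field value.
[dicom_kv]
REGEX = \([\da-fA-F]{4},[\da-fA-F]{4}\)\s+([^\[]+?)\s*\[([^\]]*)\]
FORMAT = $1::$2
# Keep every value if the same key appears more than once in an event.
MV_ADD = true
# The default (true) replaces non-alphanumeric characters in field
# names with underscores, e.g. Image_Type. Setting it to false is
# meant to keep the name as matched; verify this on your version.
CLEAN_KEYS = false

# props.conf
# Hypothetical sourcetype name; replace with the real one.
[your_dicom_sourcetype]
REPORT-dicom_kv = dicom_kv
```

Because this runs at search time, _raw is left untouched, so the backup and rename steps from the SPL version should not be needed. Events of that sourcetype that do not contain the (gggg,eeee) Name [value] pattern simply do not match and get no extra fields, which may be enough to limit the extraction to the relevant subset of events. Two caveats: this approach cannot strip the spaces out of the field names the way the second sed step does, and I have not verified how empty values such as Approval Status [] are handled, so test with the makeresults example above before relying on it.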
