Splunk Enterprise

Field extraction

mbasharat
Builder

Hi,

I need help with a field extraction. I have a sourcetype that carries compliance-related information for our use case. The data has a field named "Text", whose content arrives in many variations; two of them are shown below. I need a regex-based extraction that detects the fields inside the tags and parses them out. The data will be keyed by:

<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id>

Sample 1:

<cm:compliance-result>WARNING</cm:compliance-result> 
<cm:compliance-actual-value>Error -- evaluation period has ended</cm:compliance-actual-value> 
<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id> 
<cm:compliance-policy-value>WARNING</cm:compliance-policy-value> 
<cm:compliance-check-name>Connection error</cm:compliance-check-name>

Sample 2:

<compliance>true</compliance> 
<cm:compliance-check-name>WN10-00-000005 - Domain-joined systems must use Windows 10 Enterprise Edition 64-bit version - 64-bit</cm:compliance-check-name> 
<cm:compliance-audit-file>DISA_STIG_Windows_10_v1r20.audit</cm:compliance-audit-file> 
<cm:compliance-check-id>55aeff4f26d6b8307f6f9672750a5548</cm:compliance-check-id> 
<cm:compliance-actual-value>'64-bit'</cm:compliance-actual-value> 
<cm:compliance-policy-value>'64-bit'</cm:compliance-policy-value> 
<cm:compliance-info> Features such as Credential Guard use virtualization based security to protect information that could be used in credential theft attacks if compromised. There are a number of system requirements that must be met in order for Credential Guard to be configured and enabled properly. Virtualization based security and Credential Guard are only available with Windows 10 Enterprise 64-bit version. </cm:compliance-info> 
<cm:compliance-result>PASSED</cm:compliance-result> 
<cm:compliance-reference>800-171|3.4.1,800-53|CM-8,CAT|II,CCI|CCI-000366,CN-L3|8.1.10.2(a),CN-L3|8.1.10.2(b),CSF|DE.CM-7,CSF|ID.AM-1,CSF|ID.AM-2,CSF|PR.DS-3,ISO/IEC-27001|A.8.1.1,ITSG-33|CM-8,NESA|T1.2.1,NESA|T1.2.2,NIAv2|NS35,Rule-ID|SV-77809r3_rule,STIG-ID|WN10-00-000005,Vuln-ID|V-63319</cm:compliance-reference> 
<cm:compliance-see-also>https://dl.dod.cyber.mil/wp-content/uploads/stigs/zip/U_MS_Windows_10_V1R20_STIG.zip</cm:compliance-see-also>

Thanks in advance!

manuelostertag
Path Finder

Hi @mbasharat,

Try this one:

\<.*\>(?<Fieldtoextract>.*)\<.*\>
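A minimal Python sketch of what this pattern captures on one line from Sample 1. Python spells named groups as (?P<name>...), so the pattern is adapted accordingly; the variable names are illustrative only. Note that it puts every value into the single field name Fieldtoextract rather than creating one field per tag.

```python
import re

# Same pattern as above, with Python's named-group spelling and the
# unneeded backslashes before < and > dropped.
PATTERN = re.compile(r"<.*>(?P<Fieldtoextract>.*)<.*>")

# One line taken from Sample 1 in the question.
sample = "<cm:compliance-result>WARNING</cm:compliance-result>"

match = PATTERN.search(sample)
value = match.group("Fieldtoextract") if match else None
print(value)
```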

 

yeahnah
Motivator

Hi @mbasharat 

I believe you're asking for a regex that extracts just the compliance-check-id from the event. Is this correct? There are a few ways to do this, but if you only want that one field, this will work for you.

...
| rex "id>(?<complianceCheckID>[a-fA-F0-9]+)\<"
...
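
To check the pattern outside Splunk, here is a small Python sketch run against part of Sample 1. It assumes Python's (?P<name>...) spelling for the named group; the event string and variable names are illustrative.

```python
import re

# Same pattern as the rex call, using Python's named-group spelling.
CHECK_ID = re.compile(r"id>(?P<complianceCheckID>[a-fA-F0-9]+)<")

# A few lines from Sample 1 in the question.
event = (
    "<cm:compliance-result>WARNING</cm:compliance-result>\n"
    "<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id>\n"
    "<cm:compliance-check-name>Connection error</cm:compliance-check-name>\n"
)

match = CHECK_ID.search(event)
check_id = match.group("complianceCheckID") if match else None
print(check_id)
```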

Hope it helps.

mbasharat
Builder

Hi @yeahnah,

First of all, I appreciate your support. I need every field that arrives inside tags to be extracted, for example:

<cm:sample>sample</cm:sample>

yeahnah
Motivator

Hi @mbasharat 

Ah, OK. Have you looked at the xpath command, then? It should be able to do this for you automatically.

https://docs.splunk.com/Documentation/Splunk/8.0.6/SearchReference/Xpath
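
One thing to watch with any XML-based approach: the samples are sibling elements without a root, and they use a cm: prefix that is never declared, so a strict XML parser may reject them as they are. A minimal Python sketch of tag-based extraction, assuming a hypothetical wrapper element and a placeholder namespace URI (neither comes from the thread):

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the real one is not given in the thread.
CM_NS = "urn:example:cm"


def extract_fields(fragment: str) -> dict:
    """Wrap sibling elements in a root that declares cm:, then map tag -> text."""
    wrapped = f'<root xmlns:cm="{CM_NS}">{fragment}</root>'
    root = ET.fromstring(wrapped)
    fields = {}
    for child in root:
        # ElementTree reports namespaced tags as "{uri}local-name".
        name = child.tag.split("}", 1)[-1]
        fields[name] = (child.text or "").strip()
    return fields


# A few elements from Sample 2 in the question.
sample = (
    "<compliance>true</compliance> "
    "<cm:compliance-check-id>55aeff4f26d6b8307f6f9672750a5548</cm:compliance-check-id> "
    "<cm:compliance-result>PASSED</cm:compliance-result>"
)

fields = extract_fields(sample)
print(fields)
```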

Otherwise, you can use transforms.conf and props.conf on your search head to extract these fields automatically.

For example, on the search head(s):

transforms.conf

...
[xml-extract]
REGEX = ^<(?:cm:)*([^\>]+)>([^<]+)
FORMAT = $1::$2

props.conf (references the transforms rule)

...
[...your sourcetype...]
REPORT-extractXMLfields = xml-extract

This can also be done via the search head UI.
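
To see which key/value pairs the REGEX and FORMAT lines would produce, here is a rough Python sketch run against Sample 1. It only mimics the regex, not Splunk itself. The re.MULTILINE flag is an assumption added here so that ^ matches at the start of each line; without it Python matches only the first tag, and whether Splunk needs an inline (?m) for the same effect is not confirmed in this thread. The function name extract_pairs is illustrative.

```python
import re

# Pattern copied from the transforms.conf stanza. re.MULTILINE is an
# assumption of this sketch so that ^ matches at every line start.
XML_EXTRACT = re.compile(r"^<(?:cm:)*([^\>]+)>([^<]+)", re.MULTILINE)


def extract_pairs(text: str) -> dict:
    """Mimic FORMAT = $1::$2: group 1 becomes the key, group 2 the value."""
    return {key: value.strip() for key, value in XML_EXTRACT.findall(text)}


# Lines from Sample 1, including the trailing space after each closing tag.
sample = "\n".join([
    "<cm:compliance-result>WARNING</cm:compliance-result> ",
    "<cm:compliance-actual-value>Error -- evaluation period has ended</cm:compliance-actual-value> ",
    "<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id> ",
])

pairs = extract_pairs(sample)
for key, value in pairs.items():
    print(f"{key} = {value}")
```

Closing tags are skipped because they never start a line, and the optional cm: prefix is dropped from the key.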

mbasharat
Builder

Thank you!!!
