Splunk Enterprise

Field extraction

mbasharat
Contributor

Hi,

I need help with a field extraction. I have a sourcetype that holds compliance-related information for our use case. The data has a field named "Text", and its content arrives in many variations; two of them are shown below. I need a regex-based extraction that detects the fields inside the tags and parses them out. Data cardinality will be by compliance check ID:

 

 

<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id>

 

 

Sample: 1

 

 

<cm:compliance-result>WARNING</cm:compliance-result> 
<cm:compliance-actual-value>Error -- evaluation period has ended</cm:compliance-actual-value> 
<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id> 
<cm:compliance-policy-value>WARNING</cm:compliance-policy-value> 
<cm:compliance-check-name>Connection error</cm:compliance-check-name>

 

 

Sample: 2

 

 

<compliance>true</compliance> 
<cm:compliance-check-name>WN10-00-000005 - Domain-joined systems must use Windows 10 Enterprise Edition 64-bit version - 64-bit</cm:compliance-check-name> 
<cm:compliance-audit-file>DISA_STIG_Windows_10_v1r20.audit</cm:compliance-audit-file> 
<cm:compliance-check-id>55aeff4f26d6b8307f6f9672750a5548</cm:compliance-check-id> 
<cm:compliance-actual-value>'64-bit'</cm:compliance-actual-value> 
<cm:compliance-policy-value>'64-bit'</cm:compliance-policy-value> 
<cm:compliance-info> Features such as Credential Guard use virtualization based security to protect information that could be used in credential theft attacks if compromised. There are a number of system requirements that must be met in order for Credential Guard to be configured and enabled properly. Virtualization based security and Credential Guard are only available with Windows 10 Enterprise 64-bit version. </cm:compliance-info> 
<cm:compliance-result>PASSED</cm:compliance-result> 
<cm:compliance-reference>800-171|3.4.1,800-53|CM-8,CAT|II,CCI|CCI-000366,CN-L3|8.1.10.2(a),CN-L3|8.1.10.2(b),CSF|DE.CM-7,CSF|ID.AM-1,CSF|ID.AM-2,CSF|PR.DS-3,ISO/IEC-27001|A.8.1.1,ITSG-33|CM-8,NESA|T1.2.1,NESA|T1.2.2,NIAv2|NS35,Rule-ID|SV-77809r3_rule,STIG-ID|WN10-00-000005,Vuln-ID|V-63319</cm:compliance-reference> 
<cm:compliance-see-also>https://dl.dod.cyber.mil/wp-content/uploads/stigs/zip/U_MS_Windows_10_V1R20_STIG.zip</cm:compliance-see-also>

 

 

Thanks in advance!

1 Solution

yeahnah
Communicator

Hi @mbasharat 

Ah OK. Have you looked at the xpath command? Given a path expression for each value you want, it may be able to do this for you.

https://docs.splunk.com/Documentation/Splunk/8.0.6/SearchReference/Xpath

Otherwise, a transforms.conf and props.conf configuration on your search head can extract these fields automatically at search time.

For example, on the search head(s)

transforms.conf

 

...
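[xml-extract]
# No leading ^ so tags anywhere in the event can match;
# excluding "/" from the name skips closing tags.
REGEX = <(?:cm:)?([^/>\s]+)>([^<]+)
FORMAT = $1::$2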

 

props.conf (references the transforms rule)

 

...
[...your sourcetype...]
REPORT-extractXMLfields = xml-extract

 

This can be done via the search head UI too.
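
To sanity-check a tag/value regex like this outside Splunk, a minimal Python sketch can run it against Sample 1 from the question. The pattern here is written without a leading anchor and refuses names that start with "/", so closing tags are skipped; treat that as an assumption about how the events are laid out. The `extract_pairs` helper is hypothetical and only mimics the `$1::$2` pairing, it is not part of Splunk.

```python
import re

# Optional "cm:" prefix, then the tag name, then the text up to the next "<".
# Excluding "/" from the name keeps closing tags from matching.
TAG_PAIR = re.compile(r"<(?:cm:)?([^/>\s]+)>([^<]+)")

SAMPLE_1 = """<cm:compliance-result>WARNING</cm:compliance-result>
<cm:compliance-actual-value>Error -- evaluation period has ended</cm:compliance-actual-value>
<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id>
<cm:compliance-policy-value>WARNING</cm:compliance-policy-value>
<cm:compliance-check-name>Connection error</cm:compliance-check-name>"""


def extract_pairs(text):
    """Return {tag name: value} for every <tag>value pair found in text."""
    return {name: value.strip() for name, value in TAG_PAIR.findall(text)}


fields = extract_pairs(SAMPLE_1)
for name, value in fields.items():
    print(f"{name} = {value}")
```

As far as I know, Splunk also rewrites characters such as "-" in extracted field names to "_" by default (the CLEAN_KEYS setting in transforms.conf), so the names in search results would likely look like compliance_check_id rather than the raw tag names printed here.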


manuelostertag
Path Finder

Hi @mbasharat,

try this one:

 

\<.*\>(?<Fieldtoextract>.*)\<.*\>
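
One thing worth checking with a pattern like this is how the greedy `.*` behaves when several tags share a line. The sketch below is only an illustration in Python (which spells named groups `(?P<name>...)`); the `TIGHT` pattern is a hypothetical alternative using negated character classes, not something posted in this thread.

```python
import re

# The posted pattern, with Python's (?P<name>...) spelling for the named group.
GREEDY = re.compile(r"<.*>(?P<Fieldtoextract>.*)<.*>")
# Hypothetical tighter variant: negated classes stop at the nearest bracket.
TIGHT = re.compile(r"<[^/>]+>(?P<Fieldtoextract>[^<]*)</[^>]+>")

# Two tags on one line (values taken from Sample 1 in the question).
line = (
    "<cm:compliance-result>WARNING</cm:compliance-result> "
    "<cm:compliance-check-name>Connection error</cm:compliance-check-name>"
)

# The greedy version runs past the first tag and captures only the last value.
greedy_value = GREEDY.search(line).group("Fieldtoextract")
# The tight version yields one value per tag.
tight_values = [m.group("Fieldtoextract") for m in TIGHT.finditer(line)]

print(greedy_value)
print(tight_values)
```

Either way, every value lands in the same field name, so this approach does not by itself give one field per tag.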

 

yeahnah
Communicator

Hi @mbasharat 

I believe you're asking for a regex to just extract the compliance-check-id for the event.  Is this correct?   There are a few ways to do this but if you just want that one field then this will work for you.

...
| rex "id>(?<complianceCheckID>[a-fA-F0-9]+)\<"
...
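
For anyone who wants to try the pattern outside Splunk first, here is a small Python sketch, assuming an event that contains a few of the tags from Sample 1 on one line. The only change to the regex is Python's `(?P<name>...)` spelling for the named group.

```python
import re

# Same pattern as the rex above, using Python's named-group syntax.
CHECK_ID = re.compile(r"id>(?P<complianceCheckID>[a-fA-F0-9]+)<")

event = (
    "<cm:compliance-result>WARNING</cm:compliance-result> "
    "<cm:compliance-check-id>36c4d07cc410439bf3bf79f7f5942672</cm:compliance-check-id> "
    "<cm:compliance-policy-value>WARNING</cm:compliance-policy-value>"
)

match = CHECK_ID.search(event)
# None when the event carries no check ID at all.
check_id = match.group("complianceCheckID") if match else None
print(check_id)
```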

Hope it helps.


mbasharat
Contributor

Hi @yeahnah,

First of all, I appreciate your support. I need every field that arrives inside tags to be extracted, for example:

<cm:sample>sample</cm:sample>

 



mbasharat
Contributor

Thank you!!!
