Splunk Cloud Platform

REGEX TO CAPTURE EVERYTHING EXCEPT HTML TAGS(</> <P>)

splunkerninja1
Explorer

I need to capture everything except HTML tags such as </a>, <a>, </p>, and </b>. These tags may appear anywhere in the raw data.

I came up with a regex that matches the tags in a non-capturing group, (?:<\/?\w>), but I am stuck on how to capture everything else in the raw data.

 

Sample:

 

 Explorer is a web-browser developed by Microsoft which is included in Microsoft Windows Operating Systems.<P>
Microsoft has released Cumulative Security Updates for Internet Explorer which addresses various vulnerabilities found in Internet Explorer 8 (IE 8), Internet Explorer 9 (IE 9), Internet Explorer 10 (IE 10) and Internet Explorer 11 (IE 11). <P>

KB Articles associated with the Update:<P>
1) 4908777<BR>
2) 879586<BR>
3) 9088783<BR>
4) 789792<BR>
5) 0973782<BR>
6) 098781<BR>
7) 1234788<BR>
8) 8907799<BR><BR>

Please Note - CVE-2020-9090 required extra steps to be manually applied for being fully patched. Please refer to the FAQ seciton for <A HREF='https://portal.mtyb.windows.com/en-PK/WINDOWS-guidance/advisory/CVE-2020-9090 ' TARGET='_blank'>CVE-2020-9090 .</A><P>

QID Detection Logic (Authenticated):<BR>

Additionally the QID checks if the required Registry Keys are enabled to fully patch  <A HREF='https://portal.msrc.windows.com/en-US/guidance/advisory/CVE-2014-82789' TARGET='_blank'>CVE-2014-2897.</A> (See FAQ Section) <BR>

The keys to be patched are: <BR>
&quot;whkl\SOFTWARE\Microsoft\Internet Explorer\Main\FEATURE_ENABLE_PASTE_INFO_DISCLOSURE_FIX&quot; value &quot;iexplore.exe&quot; set to &quot;1&quot;.<BR>

ITWhisperer
SplunkTrust
| rex field=_raw mode=sed "s/<\/?\w+.*?\/?>//g"
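Reading the expression piece by piece: `<` starts a tag, `\/?` allows the slash of a closing tag, `\w+` matches the tag name, `.*?` lazily consumes any attributes, `\/?` allows a self-closing slash, and `>` ends the tag. The `g` flag replaces every match with an empty string. A minimal sketch for trying it without indexed data follows; the `makeresults` event and the example.com URL are made up for illustration and are not from the original post.

```spl
| makeresults
| eval _raw="1) 4908777<BR> Refer to <A HREF='https://example.com/advisory' TARGET='_blank'>the advisory.</A><P>"
| rex field=_raw mode=sed "s/<\/?\w+.*?\/?>//g"
| table _raw
```

Because the pattern stops at the first `>`, an attribute value that itself contains `>` would be cut short, and HTML entities such as `&quot;` in the sample are left untouched.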


splunkerninja1
Explorer

@ITWhisperer Thank you. I have an issue: I need to use the same regex on two different fields, but it throws an error when I run the query below.

| inputlookup remediation.csv 
| stats count by knowbe4, solution 
| rex field=knowbe4 mode=sed "s/<\/?\w+.*?\/?>//g" rex field=solution mode=sed "s/<\/?\w+.*?\/?>//g"

 


ITWhisperer
SplunkTrust

You need to use two separate rex commands, each introduced by its own pipe:

| inputlookup remediation.csv 
| stats count by knowbe4, solution 
| rex field=knowbe4 mode=sed "s/<\/?\w+.*?\/?>//g"
| rex field=solution mode=sed "s/<\/?\w+.*?\/?>//g"
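If more fields need the same cleanup, repeating `rex` for each one gets verbose. One possible alternative, not part of the answer above, is `foreach` with `eval replace()`. The sketch below assumes the two field names from the lookup and uses made-up `makeresults` values in place of remediation.csv. The slashes are left unescaped because `replace()` takes a plain regex rather than a sed expression.

```spl
| makeresults
| eval knowbe4="Patch now<BR>", solution="See <A HREF='https://example.com'>advisory</A><P>"
| foreach knowbe4 solution
    [ eval <<FIELD>>=replace('<<FIELD>>', "</?\w+.*?/?>", "") ]
| table knowbe4, solution
```

With the real lookup, the first two lines would be replaced by the `inputlookup` and `stats` commands from the query above.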