Splunk Cloud Platform

Regex to capture everything except HTML tags (</> <P>)

splunkerninja1
Explorer

I need to capture everything except HTML tags such as </a>, <a>, </p>, and </b>. These tags may appear anywhere in the raw data.

I came up with a regex that matches the tags in a non-capturing group, (?:<\/?\w>), but I am stuck on how to capture everything else in the raw data.

Sample:

 Explorer is a web-browser developed by Microsoft which is included in Microsoft Windows Operating Systems.<P>
Microsoft has released Cumulative Security Updates for Internet Explorer which addresses various vulnerabilities found in Internet Explorer 8 (IE 8), Internet Explorer 9 (IE 9), Internet Explorer 10 (IE 10) and Internet Explorer 11 (IE 11). <P>

KB Articles associated with the Update:<P>
1) 4908777<BR>
2) 879586<BR>
3) 9088783<BR>
4) 789792<BR>
5) 0973782<BR>
6) 098781<BR>
7) 1234788<BR>
8) 8907799<BR><BR>

Please Note - CVE-2020-9090 required extra steps to be manually applied for being fully patched. Please refer to the FAQ seciton for <A HREF='https://portal.mtyb.windows.com/en-PK/WINDOWS-guidance/advisory/CVE-2020-9090 ' TARGET='_blank'>CVE-2020-9090 .</A><P>

QID Detection Logic (Authenticated):<BR>

Additionally the QID checks if the required Registry Keys are enabled to fully patch  <A HREF='https://portal.msrc.windows.com/en-US/guidance/advisory/CVE-2014-82789' TARGET='_blank'>CVE-2014-2897.</A> (See FAQ Section) <BR>

The keys to be patched are: <BR>
&quot;whkl\SOFTWARE\Microsoft\Internet Explorer\Main\FEATURE_ENABLE_PASTE_INFO_DISCLOSURE_FIX&quot; value &quot;iexplore.exe&quot; set to &quot;1&quot;.<BR>
1 Solution

ITWhisperer
SplunkTrust
| rex field=_raw mode=sed "s/<\/?\w+.*?\/?>//g"
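
In case it helps to see why this works where the pattern in the question did not: \w on its own allows only a one-letter tag name, so <P> matches but <BR> and <A HREF=...> do not. Adding + covers longer names, and the lazy .*? skips over any attributes up to the first >. The sketch below checks both patterns outside Splunk with Python's re module, on the assumption that it treats these two patterns the same way rex does. The strip_tags helper and the placeholder URL are made up for illustration.

```python
import re

# Pattern from the question: \w without a quantifier allows only a
# one-character tag name and no attributes.
QUESTION_PATTERN = re.compile(r"<\/?\w>")

# Pattern from the answer: \w+ allows longer names such as BR, and the lazy
# .*? swallows any attributes up to the first closing '>'.
ANSWER_PATTERN = re.compile(r"<\/?\w+.*?\/?>")

# Shortened lines modelled on the sample in the question.
# The URL is a placeholder, not the one from the original data.
SAMPLE = (
    "KB Articles associated with the Update:<P>\n"
    "1) 4908777<BR>\n"
    "Please refer to the FAQ section for "
    "<A HREF='https://example.com/advisory' TARGET='_blank'>CVE-2020-9090 .</A><P>"
)


def strip_tags(pattern: re.Pattern, text: str) -> str:
    """Delete every match of pattern, like sed's s/pattern//g (hypothetical helper)."""
    return pattern.sub("", text)


if __name__ == "__main__":
    print(strip_tags(QUESTION_PATTERN, SAMPLE))
    print("---")
    print(strip_tags(ANSWER_PATTERN, SAMPLE))
```

With the question's pattern, <BR> and the opening <A ...> tag survive. With the answer's pattern, only the text is left.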


splunkerninja1
Explorer

@ITWhisperer Thank you. I have one more issue: I need to apply the same regex to two different fields, but it throws an error when I run the query below.

| inputlookup remediation.csv 
| stats count by knowbe4, solution 
| rex field=knowbe4 mode=sed "s/<\/?\w+.*?\/?>//g" rex field=solution mode=sed "s/<\/?\w+.*?\/?>//g"

 


ITWhisperer
SplunkTrust

You need to use two separate rex commands, one per field, each introduced by its own pipe:

| inputlookup remediation.csv 
| stats count by knowbe4, solution 
| rex field=knowbe4 mode=sed "s/<\/?\w+.*?\/?>//g"
| rex field=solution mode=sed "s/<\/?\w+.*?\/?>//g"
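
One caveat worth testing against your own lookup data: in most regex engines, . does not match a newline by default, so a tag whose attributes wrap onto a second line may slip past .*?. The Python sketch below shows the effect with the re module and a made-up wrapped value. The [^>]* variant is a hypothetical alternative, not something verified in Splunk, and the assumption is that rex follows the same default.

```python
import re

# Pattern from the thread: '.' stops at a newline by default.
LAZY_DOT = re.compile(r"<\/?\w+.*?\/?>")

# Hypothetical variant: [^>]* matches any character except '>',
# newlines included, so a tag may wrap across lines.
NEGATED_CLASS = re.compile(r"<\/?\w+[^>]*>")

# Made-up value with an <A> tag whose attributes wrap onto a second line.
WRAPPED = (
    "see <A HREF='https://example.com/x'\n"
    " TARGET='_blank'>the advisory</A> for details"
)


def strip_tags(pattern: re.Pattern, text: str) -> str:
    """Delete every match of pattern, like sed's s/pattern//g (hypothetical helper)."""
    return pattern.sub("", text)


if __name__ == "__main__":
    # The opening <A ...> tag survives here because it spans two lines.
    print(strip_tags(LAZY_DOT, WRAPPED))
    print("---")
    # Both the wrapped opening tag and </A> are removed here.
    print(strip_tags(NEGATED_CLASS, WRAPPED))
```

If the same holds in rex, the only change to the two commands above would be swapping .*?\/? for [^>]* inside the sed expression.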