Splunk Cloud Platform

REGEX TO CAPTURE EVERYTHING EXCEPT HTML TAGS(</> <P>)

splunkerninja1
Explorer

I need to capture everything except the html tags like </a> <a> </p> </b>. These tags may appear anywhere in the raw data.

I was able to come up with regex that matches non capturing group (?:<\/?\w>) but I am stuck with not able to capture the rest everything in raw data.

 

Sample:

 

 Explorer is a web-browser developed by Microsoft which is included in Microsoft Windows Operating Systems.<P>
Microsoft has released Cumulative Security Updates for Internet Explorer which addresses various vulnerabilities found in Internet Explorer 8 (IE 8), Internet Explorer 9 (IE 9), Internet Explorer 10 (IE 10) and Internet Explorer 11 (IE 11). <P>

KB Articles associated with the Update:<P>
1) 4908777<BR>
2) 879586<BR>
3) 9088783<BR>
4) 789792<BR>
5) 0973782<BR>
6) 098781<BR>
7) 1234788<BR>
8) 8907799<BR><BR>

Please Note - CVE-2020-9090 required extra steps to be manually applied for being fully patched. Please refer to the FAQ seciton for <A HREF='https://portal.mtyb.windows.com/en-PK/WINDOWS-guidance/advisory/CVE-2020-9090 ' TARGET='_blank'>CVE-2020-9090 .</A><P>

QID Detection Logic (Authenticated):<BR>

Additionally the QID checks if the required Registry Keys are enabled to fully patch  <A HREF='https://portal.msrc.windows.com/en-US/guidance/advisory/CVE-2014-82789' TARGET='_blank'>CVE-2014-2897.</A> (See FAQ Section) <BR>

The keys to be patched are: <BR>
&quot;whkl\SOFTWARE\Microsoft\Internet Explorer\Main\FEATURE_ENABLE_PASTE_INFO_DISCLOSURE_FIX&quot; value &quot;iexplore.exe&quot; set to &quot;1&quot;.<BR>
Tags (3)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust
| rex field=_raw mode=sed "s/<\/?\w+.*?\/?>//g"

View solution in original post

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust
| rex field=_raw mode=sed "s/<\/?\w+.*?\/?>//g"
0 Karma

splunkerninja1
Explorer

@ITWhisperer Thanks to you. I have an issue I need to use the same regex on two different fields butit throws an error when i run the below query 

| inputlookup remediation.csv 
| stats count by knowbe4, solution 
| rex field=knowbe4 mode=sed "s/<\/?\w+.*?\/?>//g" rex field=solution mode=sed "s/<\/?\w+.*?\/?>//g"

 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

You need to use two commands

| inputlookup remediation.csv 
| stats count by knowbe4, solution 
| rex field=knowbe4 mode=sed "s/<\/?\w+.*?\/?>//g"
| rex field=solution mode=sed "s/<\/?\w+.*?\/?>//g"
Get Updates on the Splunk Community!

Get ready to show some Splunk Certification swagger at .conf24!

Dive into the deep end of data by earning a Splunk Certification at .conf24. We're enticing you again this ...

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Now On-Demand Join us to learn more about how you can leverage Service Level Objectives (SLOs) and the new ...

Database Performance Sidebar Panel Now on APM Database Query Performance & Service ...

We’ve streamlined the troubleshooting experience for database-related service issues by adding a database ...