Splunk Search

Transforms to Remove HTML Tags

DBattisto
Communicator

Hello-

I'm importing data from a SQL database that includes HTML tags. Here is an example:

NoteText="This is my first sentence. This is <strong>bold test</strong></br> 
<p>Second Line</p>
<p> </p>
<p>new line</p>

I'm looking for a way to use transforms and props OR a regex in the search to remove any HTML tags and display just the text. I've used these methods for removing XML tags, but those were symmetrical and structured; I'm not familiar with how to do it for arbitrary tags scattered throughout the text.

Thanks!

richgalloway
SplunkTrust

You can do that at search time with rex. This regex is pretty basic and doesn't handle stray < characters, but may be suitable for your needs.

... | rex mode=sed "s/(\<[^>]+>)//g" | ...

You can use SEDCMD in props.conf to do the same thing.
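
As a rough sketch of the SEDCMD approach, a stanza like the one below could go in props.conf on the instance that parses the data (an indexer or heavy forwarder). The sourcetype name my_sql_notes and the class name strip_html are placeholders, not something from this thread. Note that SEDCMD rewrites _raw at index time, so it applies to the whole event rather than a single field and only affects data indexed after the change.

```ini
# props.conf
# "my_sql_notes" is a hypothetical sourcetype; substitute the one assigned to your SQL input.
[my_sql_notes]
# Remove anything of the form <...> from _raw before the event is indexed.
# "strip_html" is just an arbitrary label for this SEDCMD.
SEDCMD-strip_html = s/<[^>]+>//g
```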

---
If this reply helps you, Karma would be appreciated.

DBattisto
Communicator

Thanks. Shortly after posting, I worked out that the easiest way to implement this is at search time, applied to a specific field:

index=index1
| rex mode=sed field=FieldName3 "s/(<.*?>)//g"
| table FieldName1, FieldName2, FieldName3

The only thing to watch out for is that if someone puts ordinary text between '<' and '>', it'll be removed along with the tags. That doesn't look like it'll affect this application.
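
For anyone who wants to see that caveat in action, here is a throwaway search that should reproduce it without touching real data. It uses makeresults to create a single synthetic event; the NoteText value is made up for illustration.

```
| makeresults
| eval NoteText="Fix is <strong>done</strong>, <see ticket 42> for details"
| rex mode=sed field=NoteText "s/(<.*?>)//g"
| table NoteText
```

The <strong> tags are stripped as intended, but "<see ticket 42>" is treated as a tag too and disappears from the result.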
