Top-Level Domain Extraction (from URLs)

Explorer

I've searched and searched and can't find a regex that quite fits what I want to do. I'd like to extract just the ".com", ".net", ".org", etc. from a URL.
My "domain" field contains values such as "http://cdn.springserve.com" or "https://www.allpennystocks.org".

I also sometimes get values like "www.familylifeins.com/Resources/Shared/scripts/widgets.js" in the domain field, and of course I want to drop everything but the ".com".

What I need is just the top-level domain (".com", ".net", ".org", etc.). I've tried several different regexes I found here, but they don't quite work the way I need them to.

Basically, I want to create a list of all the TLDs my company uses in a 90-day period.

0 Karma
1 Solution

SplunkTrust

Try this regex string.

(?<TLD>\.\w+?)(?:$|\/)
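
As a quick sanity check outside Splunk, here is a minimal Python sketch that runs the same pattern against the three sample values from the question. Two things in it are assumptions rather than part of the original answer: Python's `re` module writes named groups as `(?P<TLD>...)`, so the group syntax is adapted, and the helper name `extract_tld` is made up for illustration.

```python
import re
from collections import Counter

# Python's re module spells named groups (?P<name>...), so the (?<TLD>...)
# form from the answer is adapted here. The rest of the pattern is unchanged.
TLD_PATTERN = re.compile(r"(?P<TLD>\.\w+?)(?:$|/)")


def extract_tld(domain):
    """Hypothetical helper: return the first dot-label that is followed by
    '/' or the end of the string, or None if there is no such label."""
    match = TLD_PATTERN.search(domain)
    return match.group("TLD") if match else None


# Sample values taken from the question.
samples = [
    "http://cdn.springserve.com",
    "https://www.allpennystocks.org",
    "www.familylifeins.com/Resources/Shared/scripts/widgets.js",
]

# Tally the extracted TLDs, similar in spirit to counting by field value.
tld_counts = Counter(extract_tld(s) for s in samples)
for tld, count in sorted(tld_counts.items()):
    print(tld, count)
```

The pattern only matches a label that is immediately followed by a slash or the end of the value, so a value with a port or query string right after the host (for example "example.com:8080") yields no match in this sketch.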
---
If this reply helps you, an upvote would be appreciated.

Explorer

Thank you everyone...VERY much!

0 Karma

Esteemed Legend

Don't forget to click Accept on the best answer (for you) and upvote anything else that was helpful.

0 Karma

Champion

Great minds think alike 😉

0 Karma