Getting Data In

Top-Level Domain Extraction (from URLs)

dsmeerkat
Explorer

So I've searched and searched and can't find a regex that quite fits what I want to do...What I'd like to do is extract just the ".com", ".net", ".org", etc from a URL.
My "domain" field shows: "http://cdn.springserve.com" or "https://www.allpennystocks.org", etc (for example).

I also get "www.familylifeins.com/Resources/Shared/scripts/widgets.js" sometimes in the domain field and of course I want to drop everything but the ".com"

What I need is just the top-level domain (".com", ".net", ".org", etc), and I've tried several different regex's I found here, but they don't quite work the way I need it to.

Basically I want to create a list of all the TLDs my company uses in a 90 day period.

0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

Try this regex string.

(?<TLD>\.\w+?)(?:$|\/)
---
If this reply helps you, Karma would be appreciated.

View solution in original post

dsmeerkat
Explorer

Thank you everyone...VERY much!

0 Karma

woodcock
Esteemed Legend

Don't forget to click Accept on the best answer (for you) and upvote anything else that was helpful.

0 Karma

woodcock
Esteemed Legend

richgalloway
SplunkTrust
SplunkTrust

Try this regex string.

(?<TLD>\.\w+?)(?:$|\/)
---
If this reply helps you, Karma would be appreciated.

rjthibod
Champion

Great minds think alike 😉

0 Karma
Get Updates on the Splunk Community!

Incident Response: Reduce Incident Recurrence with Automated Ticket Creation

Culture extends beyond work experience and coffee roast preferences on software engineering teams. Team ...

Splunk Classroom Chronicles: Training Tales and Testimonials (Episode 2)

Welcome to the "Splunk Classroom Chronicles" series, created to help curious, career-minded learners get ...

Index This | I am a number but I am countless. What am I?

January 2025 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  Happy New Year! We’re ...