Splunk Search

How to extract only the top level domain (TLD) from email addresses?

ICAJschuster
Engager

I am working with an email application. Currently doing a report based on domains using the product. Issue is there are many, and often arbitrary sub-domains. This is what I'm currently using:

rex field=Sender ".[^@]+?@(?<sender_domain>.+)"

The results from that look like:
test.com
sub.test.com
why.so.many.subs.echo.com
a.echo.com

So what is the "right" way to get the last 2 fields on either side of the last DOT in the field?
This is close but it only matches IF there is a subdomain and many are just TLD:

   rex field=Sender ".[^@]+?@.*(?<sender_domain>\.\w+\.[a-zA-Z]+$)"

Thanks!

1 Solution

cpetterborg
SplunkTrust
SplunkTrust

Try:

rex field=Sender "(?P<sender_domain>[A-Za-z0-9]+\.[a-zA-Z]+)$"

All you need is to look at the last part, not the whole email to get what you need, and this will find it easily.

View solution in original post

cpetterborg
SplunkTrust
SplunkTrust

Try:

rex field=Sender "(?P<sender_domain>[A-Za-z0-9]+\.[a-zA-Z]+)$"

All you need is to look at the last part, not the whole email to get what you need, and this will find it easily.

acharlieh
Influencer

This is indeed what was asked for, however, depending on what you're doing with this, you may want to look a bit deeper:

1) "Top Level domains" for some country codes you may actually want the 3rd level. For example: "amazon.co.uk"
2) You need to include hyphens and other characters as well, otherwise you may miss some domains. Of note internationalized domain names are actually prefixed: xn--

ICAJschuster
Engager

Perfect! Thank you so much!

0 Karma
Get Updates on the Splunk Community!

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

  &#x1f680; Your data just got a serious AI upgrade — are you ready? Say hello to the Agentic Era with the ...

Stronger Security with Federated Search for S3, GCP SQL & Australian Threat ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Accelerating Observability as Code with the Splunk AI Assistant

We’ve seen in previous posts what Observability as Code (OaC) is and how it’s now essential for managing ...