Getting Data In

Shorten a URL to Its Primary Domain Name from Bluecoat Logs

john5916
Engager

I'd like to shorten a URL collected from Bluecoat logs so that it lists only the primary domain name.

For example:

abcvod.abcnews.com to just abcnews.com

or

anything.google.com to just google.com

I've searched the previous questions and I've not found any working options.


s2_splunk
Splunk Employee

Here's a crude, non-regex way to do it:

| makeresults | eval domain="e.f.com" | eval parts=split(domain,"."), c=mvcount(parts) | eval last2=mvindex(parts, c-2).".".mvindex(parts, c-1)

In RegEx, you can simply anchor to the end of the full domain name string, no? Like so:

| makeresults | eval domain="e.f.com" | rex field=domain "(?<last2>\w+\.\w+)$"

This probably needs some work to cover non-word characters in the domain name (\w does not match hyphens, for example), but the principle should apply.
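
One way to cover hyphenated names is to match "anything except a dot" instead of \w. A minimal sketch, assuming the value is already in a field called domain; the makeresults line, the sample value, and the output field name primary are placeholders for illustration only:

```spl
| makeresults
| eval domain="abcvod.abc-news.com"
| rex field=domain "(?<primary>[^.]+\.[^.]+)$"
```

This should leave primary="abc-news.com". It still keeps only the last two labels, so a name under a two-part suffix such as example.co.uk would come back as co.uk.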


esix_splunk
Splunk Employee

I'm assuming you mean extracting this at search time, as opposed to changing it as it's indexed via transforms.

Have you checked out this Answers post? https://answers.splunk.com/answers/542835/top-level-domain-extraction-from-urls.html

There are also a few links in there to apps on Splunkbase that could assist with further domain analysis.

john5916
Engager

That's basically what I need. I'm not up to speed on regex though, and I need to go one dot further up the FQDN.

Instead of tracking the .com's as suggested, I want facebook.com, etc.


john5916
Engager

This link goes the opposite way and gets closer to what I need.

https://answers.splunk.com/answers/523064/eval-regex-for-host-name-from-fqdn.html

This almost does what I need:

eval hostname=replace(hostname,"^([^.]+).+","\1")

But it returns only the very first part of the FQDN. So I can get the start or the end. What I need, though, is facebook.com, cnn.com, etc.
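
The same replace() approach can be pointed at the other end of the name to keep the last two labels. A minimal sketch, assuming the host is in a field called hostname as in the eval above; the sample value and the output field name primary are placeholders for illustration only:

```spl
| makeresults
| eval hostname="anything.google.com"
| eval primary=replace(hostname, "^.*?([^.]+\.[^.]+)$", "\1")
```

The lazy ^.*? skips the leading labels and the capture group keeps the final two, so this should give primary="google.com". A value with fewer than two labels does not match and is left unchanged. Names under a two-part suffix such as .co.uk would need a lookup against a public suffix list rather than a fixed label count.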
