Shorten a URL to Its Primary Domain Name from Bluecoat Logs

john5916
Engager

I'd like to shorten a URL collected from Bluecoat logs so that it lists only the primary domain name.

For example:

abcvod.abcnews.com to just abcnews.com

or

anything.google.com to just google.com

I've searched the previous questions and I've not found any working options.

0 Karma

s2_splunk
Splunk Employee

Here's a crude, non-RegEx way to do it:

| makeresults | eval domain="e.f.com" | eval parts=split(domain,"."), c=mvcount(parts) | eval last2=mvindex(parts, c-2).".".mvindex(parts, c-1)

In RegEx, you can simply anchor to the end of the full domain name string, no? Like so:

| makeresults | eval domain="e.f.com" | rex field=domain "(?<last2>\w+\.\w+)$"

This probably needs some work to cover cases where there are non-word characters (such as hyphens, which \w does not match) in the domain name, but the principle should apply.
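As a rough sketch of that adjustment, the \w+ pieces could be swapped for [^.]+ so that each label may contain any character except a dot. The sample value abcvod.abc-news.com is made up for illustration, and last2 is the same field name used above:

```
| makeresults | eval domain="abcvod.abc-news.com" | rex field=domain "(?<last2>[^.]+\.[^.]+)$"
```

This still keeps only the last two labels, so a name under a two-part suffix such as bbc.co.uk would come back as co.uk. Handling those cases would need a suffix list or one of the Splunkbase apps rather than a single regex.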

0 Karma

esix_splunk
Splunk Employee

I'm assuming you mean extracting this at search time, as opposed to changing it as it's indexed via transforms.

Have you checked out this Answers post: https://answers.splunk.com/answers/542835/top-level-domain-extraction-from-urls.html

There are also a few links in there to apps on Splunkbase that could assist with further domain analysis.

john5916
Engager

That's basically what I need. I'm not up to speed on regex though, and I need to take it one dot further up the FQDN.

Instead of tracking the .com's as suggested, I want facebook.com, etc.

0 Karma

john5916
Engager

This link goes the opposite way, and comes closer to what I need.

https://answers.splunk.com/answers/523064/eval-regex-for-host-name-from-fqdn.html

This is close to what I need:

eval hostname=replace(hostname,"^([^.]+).+","\1")

But it returns only the very first part of the FQDN. So I can get the start or the end. What I need, though, is facebook.com, cnn.com, etc.
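One possible way to turn that replace() call around so it keeps the last two labels instead of the first is sketched below. It assumes the field is called hostname as in the line above. The output field name domain and the sample value are placeholders, not something from the Bluecoat data:

```
| makeresults | eval hostname="abcvod.abcnews.com" | eval domain=replace(hostname, "^.*\.([^.]+\.[^.]+)$", "\1")
```

The greedy ^.*\. consumes everything up to the dot in front of the last two labels, and the capture group keeps the rest. A value that already has only two labels, such as google.com, does not match the pattern and should pass through unchanged.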

0 Karma