Extracting domain names (mvindex?)

howyagoin
Contributor

I get the feeling this is going to be a tough one to solve, but I'm trying to aggregate the results of a search by domain name. I realise there is no fully general rule for this, because of cases like fubar.com versus fubar.co.uk, but my first approach was:

search term | eval mydomain=split(dest_host,".") | eval tld=mvindex(mydomain,-1) | eval target=mvindex(mydomain,1) | eval hoster=target.".".tld 

And this works most of the time, but not all of the time.

I'm operating on the (questionable) assumption that the last two dot-separated elements of dest_host are likely to be the domain name, but maybe there's a better way to perform this aggregation. I'm trying to group together results such as host123-ab.fubar.com and host445-qx.fubar.com under fubar.com, for example.
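Note that mvindex(mydomain,1) in the search above picks the second label from the left, which matches "the last two elements" only for hosts with exactly three labels (www.a.fubar.com would yield a.com). A minimal sketch that takes the last two labels regardless of depth, assuming dest_host holds a bare hostname; the field names labels and hoster are illustrative only:

```spl
search term
| eval labels=split(dest_host, ".")
| eval hoster=if(mvcount(labels) >= 2, mvjoin(mvindex(labels, -2, -1), "."), dest_host)
| stats count by hoster
```

This still groups fubar.co.uk under co.uk, so it only addresses the indexing problem, not the multi-part suffix problem.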

I suppose another way to do this is to use some sort of a lookup table with all well-known TLDs and major sub-domains (.co.uk, .ac.uk and so forth) -- but it feels like a problem others must have tried to resolve here already.
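The lookup idea could be sketched roughly as follows. This assumes a hypothetical lookup file public_suffixes.csv (not something that ships with Splunk) with two columns: suffix, holding two-part suffixes such as co.uk and ac.uk, and is_suffix, set to "true" for every row. The field names last2, last3 and hoster are also illustrative:

```spl
search term
| eval labels=split(dest_host, ".")
| eval last2=mvjoin(mvindex(labels, -2, -1), ".")
| eval last3=mvjoin(mvindex(labels, -3, -1), ".")
| lookup public_suffixes.csv suffix AS last2 OUTPUT is_suffix
| eval hoster=if(is_suffix="true" AND mvcount(labels) >= 3, last3, last2)
| stats count by hoster
```

When the last two labels match a known suffix, the search keeps three labels (fubar.co.uk); otherwise it keeps two (fubar.com). It is only as good as the suffix list behind it, and suffixes with three or more parts would need a further step.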

Suggestions welcome!

woodcock
Esteemed Legend

There are apps for this:

URL Parser: https://splunkbase.splunk.com/app/1545/
URL Toolbox: https://splunkbase.splunk.com/app/2734/
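As a rough illustration of the second option, URL Toolbox provides search macros that split a URL or hostname against a public suffix list. The sketch below assumes the app is installed and that its ut_parse macro and ut_domain output field behave as described in the app's documentation; check the documentation for your installed version before relying on it:

```spl
search term
| eval list="mozilla"
| `ut_parse(dest_host, list)`
| stats count by ut_domain
```

Here list selects which suffix list the macro uses, and ut_domain is expected to hold the registered domain (for example fubar.co.uk rather than co.uk).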
