How to trim base domain from url

jgrantham
Explorer

I have several pieces of data that look like this:

subdomain1.domain.com
subdomain2.domain.com

How do I pull only the domain.com part in a Splunk search?


jgrantham
Explorer

Finally got it to work. Here is what I came up with:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories" | rex field=dest_host "\.(?<domainname>\S+\.\S+)$" | table domainname | stats count by domainname

Here is output:

    domainname      count
    whitepages.com  2
    yellowpages.ca  21
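
The pattern above keeps everything after the first dot, so a host with more than one subdomain label (for example a hypothetical `cdn.img.whitepages.com`) would come out as `img.whitepages.com`. A minimal sketch of one possible variant is shown below: it anchors on the last two dot-separated labels instead. It runs against synthetic events from `makeresults` rather than the proxy index, the sample host names are made up for illustration, and it assumes a single-label TLD, so names such as `example.co.uk` would be mis-handled.

```spl
| makeresults
| eval dest_host=split("static.yellowpages.ca,www.yellowpages.ca,cdn.img.whitepages.com", ",")
| mvexpand dest_host
| rex field=dest_host "(?<domainname>[^.\s]+\.[^.\s]+)$"
| stats count by domainname
```

With these three sample hosts, the result should be a count of 2 for yellowpages.ca and 1 for whitepages.com.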

richgalloway
SplunkTrust

There are a few ways to do that. The URL Toolbox app adds custom commands that will parse URLs. There's also the URL Parser app, but I have no experience with it.

If your data is not too complex, you can also parse it yourself using the rex command. This puts the domain.com part into the 'domain' field, but may need to be adjusted to suit your real data.

    ... | rex ".*?\.(?<domain>.*)" | ...

---
If this reply helps you, Karma would be appreciated.
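
Note that `rex` without a `field=` argument runs against `_raw`, so for the data in this thread the pattern would need `field=dest_host` (or `field=dest`). The following is a minimal sketch for trying the idea on synthetic events before touching real data. The host names are illustrative, and the pattern is a slightly stricter variant of the one above (the first label may not contain a dot), not the exact expression from the reply.

```spl
| makeresults
| eval dest_host=split("static.yellowpages.ca,www.yellowpages.ca,www.whitepages.com", ",")
| mvexpand dest_host
| rex field=dest_host "^[^.]+\.(?<domain>.+)$"
| stats count by domain
```

With these three sample hosts, the result should be a count of 2 for yellowpages.ca and 1 for whitepages.com.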

jgrantham
Explorer

Here is what I am using:

| rex field=dest_host".?.(?.)"
|table domain

What am I doing wrong? dest_host contains data like www.domain.com or sub.domain.com. I want to pull a count of domain.com. I am new to Splunk and can't figure this out.

Any help would be greatly appreciated.


jgrantham
Explorer

Here is what I am using:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories" | table user dest | stats count by dest

Here is what I get back:

    dest                   count
    static.yellowpages.ca  19
    www.yellowpages.ca     2

Here is what I would like to see:

    dest            count
    yellowpages.ca  21


richgalloway
SplunkTrust
SplunkTrust

Please edit your comment to indent the SPL at least 4 spaces so the SPL is preserved. Tell us what you get from the SPL and what you expected to see.


jgrantham
Explorer

OK. I am running a query where one of the fields is dest_host. It returns values like www.domain.com, sub.domain.com, and sub1.domain.com. I am trying to get a single total count for everything under domain.com. I currently have to add these up manually, and it is a pain.

Here is the SPL:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories"
    | table user dest
    | stats count by dest

What I get is a table with the following:

    dest                   count
    static.yellowpages.ca  19
    www.yellowpages.ca     2

What I would like to see is:

    yellowpages.ca  21
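
One way to get from the first table to the desired one without a regular expression is to split the host name on dots and rejoin the last two labels. This is only a sketch: it uses synthetic events in place of the proxy search, the field names `labels` and `domain` and the sample host values are illustrative, and it assumes the base domain is always the last two labels (which does not hold for names like `example.co.uk`).

```spl
| makeresults
| eval dest=split("static.yellowpages.ca,www.yellowpages.ca", ",")
| mvexpand dest
| eval labels=split(dest, ".")
| eval domain=mvjoin(mvindex(labels, -2, -1), ".")
| stats count by domain
```

With these two sample hosts, the result should be a single row for yellowpages.ca with a count of 2. In the real search, the last three lines would follow the original `index=...` filter in place of `| table user dest | stats count by dest`.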
