Getting Data In

How to trim base domain from URL

jgrantham
Explorer

I have several pieces of data that look like this:

subdomain1.domain.com
subdomain2.domain.com

How do I extract only the domain.com part in a Splunk search?

0 Karma
1 Solution

jgrantham
Explorer

Finally got it to work. Here is what I came up with:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories" | rex field=dest_host "\.(?<domainname>\S+\.\S+)$" | table domainname | stats count by domainname

Here is output:

    domainname      count
    whitepages.com  2
    yellowpages.ca  21
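
For anyone adapting this to other hostnames, here is a rough Python sketch of how a pattern of this shape behaves. It is an illustration only, not Splunk code: Python spells named groups `(?P<name>...)` where rex uses `(?<name>...)`, the inner dot is escaped here, and the names `LEADING_DOT`, `LAST_TWO`, and `extract` are made up for the example. `LAST_TWO` is an alternative I am assuming would suit hosts with no subdomain or with several levels of subdomain.

```python
import re

# Same shape as the rex pattern: a literal dot, then "something.something" to end of string.
LEADING_DOT = re.compile(r"\.(?P<domainname>\S+\.\S+)$")

# Hypothetical alternative: keep only the last two dot-separated labels.
LAST_TWO = re.compile(r"(?P<domainname>[^.\s]+\.[^.\s]+)$")


def extract(pattern, host):
    """Return the captured domainname, or None when the pattern does not match."""
    match = pattern.search(host)
    return match.group("domainname") if match else None


for host in ("static.yellowpages.ca", "yellowpages.ca", "a.b.yellowpages.ca"):
    print(host, extract(LEADING_DOT, host), extract(LAST_TWO, host))
```

The leading-dot pattern needs at least one subdomain to match, and it keeps everything after the first dot, so a host with two subdomain levels would still be counted separately. The last-two-labels variant folds all three sample hosts into yellowpages.ca.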


0 Karma

richgalloway
SplunkTrust

There are a few ways to do that. The URL Toolbox app adds custom commands that will parse URLs. There's also the URL Parser app, but I have no experience with it.

If your data is not too complex, you can also parse it yourself using the rex command. This puts the domain.com part into the 'domain' field, but may need to be adjusted to suit your real data.

    ... | rex ".*?\.(?<domain>.*)" | ...
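
To see what that pattern captures, here is a small Python sketch (an illustration, not Splunk code). Python writes the named group as `(?P<domain>...)`, and the helper name `after_first_dot` is invented for the example. Note that rex without a `field=` argument runs against `_raw`, so `field=dest_host` would be needed to apply it to that field.

```python
import re

# Lazy ".*?" stops at the first dot; the group then takes the rest of the line.
FIRST_DOT = re.compile(r".*?\.(?P<domain>.*)")


def after_first_dot(host):
    """Return everything after the first dot, or None if there is no dot."""
    match = FIRST_DOT.search(host)
    return match.group("domain") if match else None


for host in ("subdomain1.domain.com", "a.b.domain.com", "domain.com", "localhost"):
    print(host, "->", after_first_dot(host))
```

In other words, the pattern strips exactly one leading label. That suits hosts of the form sub.domain.com, but a bare domain.com would be reduced to com.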
---
If this reply helps you, Karma would be appreciated.
0 Karma

jgrantham
Explorer

Here is what I am using:

| rex field=dest_host".?.(?.)"
|table domain

What am I doing wrong? dest_host contains data like www.domain.com or sub.domain.com. I want to pull a count of domain.com. I am new to Splunk and can't figure this out.

Any help would be greatly appreciated.

0 Karma

jgrantham
Explorer

Here is what I am using:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories" | table user dest | stats count by dest

Here is what I get back:

    dest                   count
    static.yellowpages.ca  19
    www.yellowpages.ca     2

Here is what I would like to see:

dest count
yellowpages.ca 21

0 Karma

richgalloway
SplunkTrust

Please edit your comment to indent the SPL at least 4 spaces so the SPL is preserved. Tell us what you get from the SPL and what you expected to see.

---
If this reply helps you, Karma would be appreciated.
0 Karma

jgrantham
Explorer

OK. I am running a query where one of the fields is dest_host. It returns values like www.domain.com, sub.domain.com, and sub1.domain.com. I am trying to get a single combined count for everything under domain.com. I currently have to add these up manually, and it is a pain.

Here is the SPL:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories"
    | table user dest
    | stats count by dest

What I get is a table with the following:

    dest                   count
    static.yellowpages.ca  19
    www.yellowpages.ca     2

What I would like to see is:
yellowpages.ca 21
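
The roll-up being asked for can be sketched outside Splunk like this. It is a minimal Python illustration under one loud assumption: the base domain is simply the last two dot-separated labels. The function names `base_domain` and `count_by_base_domain` are hypothetical, and the sample rows are the two from the table above.

```python
from collections import Counter


def base_domain(host):
    """Assumption: the base domain is the last two labels (e.g. yellowpages.ca)."""
    labels = host.strip().lower().split(".")  # hostnames are case-insensitive
    return ".".join(labels[-2:])


def count_by_base_domain(rows):
    """Sum per-host counts into per-base-domain totals."""
    totals = Counter()
    for host, count in rows:
        totals[base_domain(host)] += count
    return dict(totals)


rows = [("static.yellowpages.ca", 19), ("www.yellowpages.ca", 2)]
print(count_by_base_domain(rows))  # {'yellowpages.ca': 21}
```

The two-label assumption breaks for suffixes such as co.uk, where example.co.uk would be folded into co.uk. For data like that, a parsing app such as the URL Toolbox mentioned earlier in the thread may be a better fit.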

0 Karma