Getting Data In

How to trim the base domain from a URL

jgrantham
Explorer

I have several pieces of data that look like this:

subdomain1.domain.com
subdomain2.domain.com

How do I pull only the domain.com part in a Splunk search?

0 Karma
1 Solution

jgrantham
Explorer

Finally got it to work. Here is what I came up with:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories" | rex field=dest_host "\.(?<domainname>\S+.\S+)$" | table domainname | stats count by domainname

Here is output:

    domainname      count
    whitepages.com  2
    yellowpages.ca  21
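
To sanity-check the regex outside Splunk, here is a minimal Python sketch. It is an illustration only, not part of the search. Assumptions: Python writes named groups as `(?P<name>...)` where rex uses `(?<name>...)`, the helper name `base_domain` is made up for this example, and the per-host counts are copied from the `stats count by dest` output posted elsewhere in this thread.

```python
import re
from collections import Counter

# Same pattern as the rex in the search, kept as posted (including the
# unescaped "." between the two \S+). Only the named-group spelling changes:
# Python needs (?P<name>...), Splunk's rex accepts (?<name>...).
PATTERN = re.compile(r"\.(?P<domainname>\S+.\S+)$")

# Per-host counts taken from the "stats count by dest" output in this thread.
host_counts = {
    "static.yellowpages.ca": 19,
    "www.yellowpages.ca": 2,
}


def base_domain(host):
    """Return the text captured by the pattern, or None when it does not match."""
    match = PATTERN.search(host)
    return match.group("domainname") if match else None


# Roll the per-host counts up to the extracted domain, like "stats count by domainname".
totals = Counter()
for host, count in host_counts.items():
    domain = base_domain(host)
    if domain is not None:
        totals[domain] += count

print(dict(totals))
```

Two things worth checking against real data: the pattern starts capturing after the first dot, so a hypothetical host like a.b.yellowpages.ca would come back as b.yellowpages.ca, and a host with no subdomain at all (plain yellowpages.ca) does not match.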

0 Karma

richgalloway
SplunkTrust

There are a few ways to do that. The URL Toolbox app adds custom commands that will parse URLs. There's also the URL Parser app, but I have no experience with it.

If your data is not too complex, you can also parse it yourself using the rex command. This puts the domain.com part into the 'domain' field, but may need to be adjusted to suit your real data.

... | rex ".*?\.(?<domain>.*)" | ...
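
For anyone who wants to see what that pattern captures before running it in Splunk, here is a small Python sketch. It is a rough illustration under stated assumptions: Python spells the named group `(?P<domain>...)` instead of `(?<domain>...)`, the function name `extract_domain` is invented for the example, and the last two inputs are hypothetical strings that do not come from this thread.

```python
import re

# Python spelling of the rex pattern above: (?P<domain>...) instead of (?<domain>...).
PATTERN = re.compile(r".*?\.(?P<domain>.*)")


def extract_domain(text):
    """Return everything after the first dot in text, or None if there is no dot."""
    match = PATTERN.search(text)
    return match.group("domain") if match else None


# Hostnames from the question.
for host in ("subdomain1.domain.com", "subdomain2.domain.com"):
    print(host, "->", extract_domain(host))

# Hypothetical inputs (not from the thread) showing where the pattern may need adjusting:
# a second subdomain level, and a raw event with text around the hostname.
print(extract_domain("a.b.domain.com"))
print(extract_domain("GET subdomain1.domain.com 200"))
```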
---
If this reply helps you, Karma would be appreciated.
0 Karma

jgrantham
Explorer

Here is what I am using:

| rex field=dest_host".?.(?.)"
|table domain

What am I doing wrong? dest_host contains data like www.domain.com or sub.domain.com. I want to pull a count of domain.com. I am new to Splunk and can't figure this out.

Any help would be greatly appreciated.

0 Karma

jgrantham
Explorer

Here is what I am using:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories" | table user dest | stats count by dest

Here is what I get back:

    dest                   count
    static.yellowpages.ca  19
    www.yellowpages.ca     2

Here is what I would like to see:

    dest            count
    yellowpages.ca  21

0 Karma

richgalloway
SplunkTrust

Please edit your comment to indent the SPL at least 4 spaces so the SPL is preserved. Tell us what you get from the SPL and what you expected to see.

---
If this reply helps you, Karma would be appreciated.
0 Karma

jgrantham
Explorer

OK. I am running a query where one of the fields is dest_host. It brings back results like www.domain.com, sub.domain.com, and sub1.domain.com. I am trying to get a single total count for every host that ends in domain.com. I currently have to add these up manually, and it is a pain.

Here is the SPL:

    index=us_cseo_prod_webproxy sourcetype=mcafee:wg:kv action_name=allow policydecidingaccess="Allow Hosts in Global Whitelist - Telephone Directories"
    | table user dest
    | stats count by dest

What I get is a table with the following:

    dest                   count
    static.yellowpages.ca  19
    www.yellowpages.ca     2

What I would like to see is:

    yellowpages.ca  21

0 Karma