Splunk Search

How to group similar domain/URL patterns?

ail321
Engager

I would like to group URL fields and get a total count. 

When I do this:

index=example source=example_example dest="*.amazonaws.com" OR dest="*.amazoncognito.com" OR dest="slack.com" OR dest="*.docker.io" | dedup dest | table dest | stats count by dest


the output is this:

dest                                                      count
352532535.abc.def.eu-xxxxx-1.amazonaws.com                1
abc.auth.xx-aaaa-1.amazoncognito.com                      1
aaa1-stage-login-abcdef.auth.xx-abcd-1.amazoncognito.com  1
346345452.abc.def.us-abcd-2.amazonaws.com                 1
autoscaling.xx-east-4.amazonaws.com                       1
slack.com                                                 1
registry-1.docker.io                                      1
auth.docker.io                                            1

I wanted to group them by similar patterns like this:

groupedURL          count
.amazonaws.com      3
.amazoncognito.com  2
slack.com           1
.docker.io          2

I've tried other queries based on some posts here, but no luck. They mostly dealt with what comes after the '.com'.


ITWhisperer
SplunkTrust

You already appear to know which "common" parts of the URLs you are interested in, since they are part of your search filter, so you could just count them:

| stats count(eval(match(dest,"amazonaws\.com"))) as amazonaws.com
        count(eval(match(dest,"amazoncognito\.com"))) as amazoncognito.com
        count(eval(match(dest,"slack\.com"))) as slack.com
        count(eval(match(dest,"docker\.io"))) as docker.io
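To see what the count(eval(match(...))) approach computes, here is a rough Python sketch that can be run outside Splunk. It assumes the eight sample dest values from the question and treats match() as an unanchored regex search; the names DESTS, PATTERNS and count_by_pattern are made up for illustration and are not part of Splunk.

```python
import re

# Sample dest values copied from the question's output (one event each after dedup).
DESTS = [
    "352532535.abc.def.eu-xxxxx-1.amazonaws.com",
    "abc.auth.xx-aaaa-1.amazoncognito.com",
    "aaa1-stage-login-abcdef.auth.xx-abcd-1.amazoncognito.com",
    "346345452.abc.def.us-abcd-2.amazonaws.com",
    "autoscaling.xx-east-4.amazonaws.com",
    "slack.com",
    "registry-1.docker.io",
    "auth.docker.io",
]

# Same regexes as in the stats command; output label -> pattern.
PATTERNS = {
    "amazonaws.com": r"amazonaws\.com",
    "amazoncognito.com": r"amazoncognito\.com",
    "slack.com": r"slack\.com",
    "docker.io": r"docker\.io",
}


def count_by_pattern(dests, patterns):
    """Count how many dest values contain a match for each regex."""
    return {
        label: sum(1 for dest in dests if re.search(rx, dest))
        for label, rx in patterns.items()
    }


if __name__ == "__main__":
    for label, n in count_by_pattern(DESTS, PATTERNS).items():
        print(label, n)
```

Because the patterns are unanchored, a value such as notslack.com would also be counted under slack.com. If that matters, anchoring the pattern, for example (^|\.)slack\.com$, is one option.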

ail321
Engager

This worked. What if I add something like 170.51.31.0/22 to the search?


johnhuang
Motivator

If you're searching for the literal string "170.51.31.0/22":

index=example source=example_example "170.51.31.0/22"
| stats count by <field_name>


If you're searching for IP addresses that fall within the CIDR range "170.51.31.0/22" (cidrmatch is an eval function, so use it with where rather than search):

index=example source=example_example
| where cidrmatch("170.51.31.0/22", <dest_ip_field>)
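As a sanity check on which addresses that range covers, here is a small Python sketch using the standard ipaddress module. Note that 170.51.31.0 is not the first address of a /22 block, so the block is normalized to 170.51.28.0/22. Whether Splunk's cidrmatch masks the address the same way is an assumption here, and the helper name in_range is hypothetical.

```python
import ipaddress

# strict=False is needed because 170.51.31.0 has host bits set for a /22 prefix;
# the network is normalized to its base address.
NET = ipaddress.ip_network("170.51.31.0/22", strict=False)


def in_range(ip, net=NET):
    """Return True if the IPv4 address string falls inside the network."""
    return ipaddress.ip_address(ip) in net


if __name__ == "__main__":
    print(NET)  # normalized network
    print(NET[0], NET[-1])  # first and last address in the block
    for ip in ("170.51.28.1", "170.51.31.255", "170.51.32.0"):
        print(ip, in_range(ip))
```

If only 170.51.31.x addresses are wanted, a /24 mask (170.51.31.0/24) would be the narrower choice.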

 

 


PickleRick
SplunkTrust

You do way too many things in your search which actually slow it down.

Lose the dedup and lose the table. Just search and stat. That's for starters.

Secondly, use rex to extract the last two labels of the domain (the base domain), then do your stats on that:

<base search>
| rex field=dest "(.*\.)?(?<base>[^.]+\.[^.]+)"
| stats count by base
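To check what that regex captures before running it in Splunk, here is a hedged Python sketch of the same idea. The pattern is the one from the rex command, with the named group written in Python's (?P<base>...) syntax. The sample values come from the question, and base_domain and count_by_base are illustrative names only.

```python
import re
from collections import Counter

# Same pattern as the rex command; Python spells named groups (?P<name>...).
BASE_RE = re.compile(r"(.*\.)?(?P<base>[^.]+\.[^.]+)")

# Sample dest values copied from the question's output.
DESTS = [
    "352532535.abc.def.eu-xxxxx-1.amazonaws.com",
    "abc.auth.xx-aaaa-1.amazoncognito.com",
    "aaa1-stage-login-abcdef.auth.xx-abcd-1.amazoncognito.com",
    "346345452.abc.def.us-abcd-2.amazonaws.com",
    "autoscaling.xx-east-4.amazonaws.com",
    "slack.com",
    "registry-1.docker.io",
    "auth.docker.io",
]


def base_domain(dest):
    """Return the last two dot-separated labels, or None if there is no dot."""
    m = BASE_RE.search(dest)
    return m.group("base") if m else None


def count_by_base(dests):
    """Rough equivalent of '| stats count by base' for a list of dest values."""
    return Counter(b for b in map(base_domain, dests) if b is not None)


if __name__ == "__main__":
    for base, n in count_by_base(DESTS).items():
        print(base, n)
```

One limitation of the "last two labels" heuristic: for a multi-part suffix such as example.co.uk it yields co.uk rather than example.co.uk, so domains like that would need separate handling.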

 
