Extract domains from raw data into a new field and create a table with count

lilvermi · 11-21-2021

I have raw data and would like to search it for domains, write each match to a field, and then run stats to show a count of each unique domain.

Example of raw data:

"This investigation is really great and we found the suspicious domain google.com"

I would like to:
1. Search for domains within the raw data and output each domain to a field that I can show in a table (let's call it "Domain").
2. Run stats to show the number of occurrences of each domain.

So ideally, my finished result would be:

Domain	count
google.com	50
yahoo.com	30

Any assistance is greatly appreciated, thank you.

bowesmana · 11-21-2021

The key is deciding how to recognise a domain. You can search the web for regexes that extract domains and find some examples, but this search will show you how to get started:

| makeresults
| eval d=split("google.com,abc.net.au,bbc.co.uk,google.com,splunk.com,www.nytimes.com", ",")
| mvexpand d
| rex field=d "(?<domain>(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9])"
| stats count by domain

For your data, use rex field=_raw instead of field=d in the search above, and rename the capture group from "domain" to "Domain" if you want that field name.

If an event might contain more than one domain, add max_match=0 to the rex command so that every match is captured as a value of a multivalue field rather than only the first.
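
Putting those adjustments together, a minimal sketch might look like the search below. It is an illustration, not a tested solution for your data: the sample event is the sentence from the question with a second domain (yahoo.com) added purely to show max_match=0 at work, and the field is named Domain to match the table you asked for. Replace the first two lines with your own base search.

```spl
| makeresults
| eval _raw="This investigation is really great and we found the suspicious domains google.com and yahoo.com"
| rex field=_raw max_match=0 "(?<Domain>(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9])"
| mvexpand Domain
| stats count by Domain
| sort - count
```

Here mvexpand turns each extracted value into its own row before stats counts them, and sort - count puts the most frequent domains first. Note that the pattern only matches lowercase text and will also match other dotted tokens such as file names or IP addresses, so you may need to tighten it for your data.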
