Splunk Search

How to create a regex to match URL ending with file extension to detect file downloads?

jkumarr2
New Member

I am trying to write a regex that will detect/match URLs ending with 2-, 3- and 4-letter file extensions (e.g. .py, .txt, .xlsx and the numerous other known file extensions). I used this regex in my Splunk search:

| regex url=".*[a-zA-Z]{2,4}$"

but this also matches URLs like www.liverpoolfc.com, which do not end with a file extension.

I also tried this regex:

| regex url="//.+?/.+?.$" 

This is meant to look for http: or https:, then two "/" followed by the domain name and one "/", followed by any stream of characters and ending with a 2- to 4-letter word. But it is not giving the correct results: it omits a few URLs that have multiple "/" in the full URL path. Any better suggestions?

Below is a sample set of URLs that I used as a reference:

http://www.liverpoolfc.com
http://www.blackberry.com
http://www.lflogistics.com/sites/default/files/news/lflstc.pdf
https://www.abc.com/tiny/7uwi2
https://download.abc.com/download/ep/FE-90CRC000-28.zip
http://www3.abce.hk/listedco/listconews/SEHK/2019/0521/LTN20190521894.pdf
https://www.abc.com/review/www.xyz-center.com
https://xyz.abc.com/abc-voyager.php
http://wealthbriefing.com/forms/view.php?id=1456762&element_34=saint.xyz@gmail.com

woodcock
Esteemed Legend

Like this:

... |regex url="^https?:\/\/.*[\\\/].+\.[a-zA-Z]{2,4}$"
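
As a quick sanity check outside Splunk, the pattern above can be run against the sample URLs from the question. This is only a sketch: it uses Python's `re` module rather than Splunk's regex engine (the two should agree for a pattern this simple, but that is an assumption), and the helper name `looks_like_download` is made up for illustration.

```python
import re

# The pattern from the answer above, written as a Python raw string.
# The SPL string escaping (\/ and \\\/) is not needed here.
PATTERN = re.compile(r"^https?://.*[\\/].+\.[a-zA-Z]{2,4}$")

# Sample URLs copied from the question.
SAMPLE_URLS = [
    "http://www.liverpoolfc.com",
    "http://www.blackberry.com",
    "http://www.lflogistics.com/sites/default/files/news/lflstc.pdf",
    "https://www.abc.com/tiny/7uwi2",
    "https://download.abc.com/download/ep/FE-90CRC000-28.zip",
    "http://www3.abce.hk/listedco/listconews/SEHK/2019/0521/LTN20190521894.pdf",
    "https://www.abc.com/review/www.xyz-center.com",
    "https://xyz.abc.com/abc-voyager.php",
]


def looks_like_download(url: str) -> bool:
    """Hypothetical helper: True if the URL matches the pattern."""
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    for url in SAMPLE_URLS:
        print(looks_like_download(url), url)
```

With these inputs the bare domains and the /tiny/7uwi2 link are rejected, and the .pdf, .zip and .php links are accepted. Note that https://www.abc.com/review/www.xyz-center.com is accepted as well, because its last path segment ends in ".com". A regex on the URL string alone cannot tell that apart from a file extension.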

jnudell_2
Builder

Hi @jkumarr2 ,

I would use something like this:

... your search ...
| regex url="(https?:\/\/)?([A-Za-z0-9\-]+)?\.([A-Za-z0-9\-]+)\.([A-Za-z0-9\-]+)(\/?.*\/(.+\.[A-Za-z]{2,3})$)"

or maybe:

... your search ...
| regex url=".*\/\/[^\/]+\/?.*\/.*\.[A-Za-z]{2,3}"
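
One thing that may be worth checking with the second pattern: it has no trailing `$`, so (assuming the `regex` command matches anywhere in the field, as an unanchored search does) a dot followed by letters in the middle of the path is enough to match. The sketch below compares the pattern as posted with a variant that adds `$` and allows 2 to 4 letters. It uses Python's `re` module, and the example.com URLs are made up for illustration.

```python
import re

# Second pattern as posted (no end anchor, 2-3 letters).
AS_POSTED = re.compile(r".*//[^/]+/?.*/.*\.[A-Za-z]{2,3}")

# Hypothetical variant: anchored at the end, 2-4 letters as the question asks.
ANCHORED = re.compile(r".*//[^/]+/?.*/.*\.[A-Za-z]{2,4}$")


def matches(pattern: re.Pattern, url: str) -> bool:
    """True if the pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


# Made-up URLs: a dotted directory name, and a 4-letter extension.
dotted_dir = "https://www.example.com/my.folder/page"
xlsx = "https://www.example.com/reports/q1.xlsx"

if __name__ == "__main__":
    for url in (dotted_dir, xlsx):
        print(matches(AS_POSTED, url), matches(ANCHORED, url), url)
```

Here the unanchored pattern accepts the dotted-directory URL (it matches ".fol" inside "my.folder"), while the anchored variant rejects it and still accepts the .xlsx link.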

gcusello
SplunkTrust

Hi jkumarr2,
try this one

(?P<URL>[^ ]*\.\w*)$

You can test it at https://regex101.com/r/2syl1Z/1
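
For anyone who wants to try the pattern locally as well, here is a small sketch using Python's `re` module, which supports the same `(?P<URL>...)` named-group syntax. The helper name `captured_url` is hypothetical.

```python
import re
from typing import Optional

# The pattern from this answer, unchanged.
PATTERN = re.compile(r"(?P<URL>[^ ]*\.\w*)$")


def captured_url(text: str) -> Optional[str]:
    """Hypothetical helper: return the URL group, or None if there is no match."""
    match = PATTERN.search(text)
    return match.group("URL") if match else None


if __name__ == "__main__":
    for url in (
        "https://download.abc.com/download/ep/FE-90CRC000-28.zip",
        "https://www.abc.com/tiny/7uwi2",
        "http://www.liverpoolfc.com",
    ):
        print(captured_url(url), "<-", url)
```

On these three sample URLs the .zip link is captured and the /tiny/7uwi2 link is not, but the bare domain http://www.liverpoolfc.com is captured too, since it also ends in a dot followed by word characters. If bare domains need to be excluded, the pattern would have to require a "/" after the host as well.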

Bye.
Giuseppe
