Splunk Search

How to create a regex to match URL ending with file extension to detect file downloads?

jkumarr2
New Member

I am trying to write a regex that matches URLs ending with a 2, 3, or 4 letter file extension (e.g. .py, .txt, .xlsx, and the many other known file extensions). I used this regex in my Splunk search:

|regex field url=".*[a-zA-Z]{2-4}$"

but this will match URLs like www.liverpoolfc.com, which do not end with a file extension.

I also tried this regex:

| regex url="//.+?/.+?.$"

It is meant to look for http: or https:, then the two "/" characters followed by the domain, then one "/" followed by any run of characters, ending with a 2 to 4 letter extension. It does not give the correct results: it omits a few URLs that have multiple "/" in the full URL path. Any better suggestions?

Below is a sample set of URLs that I used as a reference:

http://www.liverpoolfc.com
http://www.blackberry.com
http://www.lflogistics.com/sites/default/files/news/lflstc.pdf
https://www.abc.com/tiny/7uwi2
https://download.abc.com/download/ep/FE-90CRC000-28.zip
http://www3.abce.hk/listedco/listconews/SEHK/2019/0521/LTN20190521894.pdf
https://www.abc.com/review/www.xyz-center.com
https://xyz.abc.com/abc-voyager.php
http://wealthbriefing.com/forms/view.php?id=1456762&element_34=saint.xyz@gmail.com
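
For reference, here is a minimal Python sketch of why the first attempt misbehaves. It assumes Python's re module treats these simple constructs the same way Splunk's PCRE engine does, and it checks only two of the sample URLs. As written, {2-4} is not a valid quantifier, so the engine reads it as literal text. The intended form is {2,4}, and even that matches a bare domain because nothing in the pattern requires a path.

```python
import re

urls = [
    "http://www.liverpoolfc.com",
    "http://www.lflogistics.com/sites/default/files/news/lflstc.pdf",
]

# "{2-4}" is not a quantifier, so it is matched as the literal text "{2-4}".
literal = re.compile(r".*[a-zA-Z]{2-4}$")
# "{2,4}" is the range quantifier: 2 to 4 trailing letters.
ranged = re.compile(r".*[a-zA-Z]{2,4}$")

literal_hits = [u for u in urls if literal.search(u)]
ranged_hits = [u for u in urls if ranged.search(u)]

print("literal {2-4}:", literal_hits)
print("ranged  {2,4}:", ranged_hits)
```

The ranged version accepts both URLs, since "com" is itself a run of 2 to 4 letters at the end of the string.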

woodcock
Esteemed Legend

Like this:

... |regex url="^https?:\/\/.*[\\\/].+\.[a-zA-Z]{2,4}$"
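
A quick way to sanity-check this pattern outside Splunk is a small Python sketch like the one below. It assumes Python's re module handles these escapes the same way Splunk's PCRE engine does, and it uses the first eight sample URLs from the question.

```python
import re

# Pattern copied from the answer above; the escapes are also valid in Python's re.
pattern = re.compile(r"^https?:\/\/.*[\\\/].+\.[a-zA-Z]{2,4}$")

sample_urls = [
    "http://www.liverpoolfc.com",
    "http://www.blackberry.com",
    "http://www.lflogistics.com/sites/default/files/news/lflstc.pdf",
    "https://www.abc.com/tiny/7uwi2",
    "https://download.abc.com/download/ep/FE-90CRC000-28.zip",
    "http://www3.abce.hk/listedco/listconews/SEHK/2019/0521/LTN20190521894.pdf",
    "https://www.abc.com/review/www.xyz-center.com",
    "https://xyz.abc.com/abc-voyager.php",
]

matched = [u for u in sample_urls if pattern.search(u)]
skipped = [u for u in sample_urls if not pattern.search(u)]

for u in matched:
    print("match   ", u)
for u in skipped:
    print("no match", u)
```

On this sample, the two bare domains and the /tiny/7uwi2 link are rejected. https://www.abc.com/review/www.xyz-center.com is still accepted because its last path segment ends in .com, so a list of known extensions may be needed if that counts as a false positive.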

jnudell_2
Builder

Hi @jkumarr2 ,

I would use something like this:

... your search ...
| regex url="(https?:\/\/)?([A-Za-z0-9\-]+)?\.([A-Za-z0-9\-]+)\.([A-Za-z0-9\-]+)(\/?.*\/(.+\.[A-Za-z]{2,4})$)"

or maybe:

... your search ...
| regex url=".*\/\/[^\/]+\/?.*\/.*\.[A-Za-z]{2,4}$"

gcusello
SplunkTrust

Hi jkumarr2,
try this one

(?P<URL>[^ ]*\.\w*)$

You can test it at https://regex101.com/r/2syl1Z/1

Bye.
Giuseppe
