Splunk Search

How to use regex/rex to extract filename from URI

Explorer

I am looking for a way to extract the filenames of executable files from URLs in proxy logs. The url field in my logs contains the full URL. Here are a few examples. I think we just need to capture everything after the last "/" when it ends in a "." followed by 3 or 4 characters. Has anyone done anything like this?

url=http://www.kaco.net/download/kacotv.exe
url=http://acroipm2.adobe.com/15/rdr/ENU/win/nooem/none/consumer/message.zip
url=https://prod308-client.redplum.com/protocol/install/P@H_prod308-1dF7CZ5x.exe
url=http://download.microsoft.com/download/5/3/D/53D3880B-25F8-4714-A4AC-E463A492F96E/41212.00/Silverlight_x64.exe
url=http://download.flv.com/kits/flvd/flvdownloader_setup.exe

SplunkTrust

Try this run-anywhere sample. It is the regex I use for any URL-related field extraction, because it also captures other parts of the URL (the path and the context root) along with the filename.

| gentimes start=-1 | eval url="http://www.kaco.net/download/kacotv.exe" | rex field=url "(?P<requestedUrl>(?P<path>\/(((?P<contextRoot>[^\/]+))(\S+\/)*(?P<filename>[^\/\?;=\s]+)([^\s]*))))" 

Replace the "| gentimes ... | eval url=..." portion with your base search.
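
To see what the named groups capture without a Splunk instance, the same pattern can be tried in Python, whose re module accepts the (?P<name>...) syntax used above. This is only a sketch: Python's re is not the PCRE engine behind rex (the pattern uses nothing where I would expect the two to differ), and the helper name parse_url is made up for illustration.

```python
import re

# Pattern copied from the rex command above, split across two lines for width.
URL_PARTS = re.compile(
    r"(?P<requestedUrl>(?P<path>\/(((?P<contextRoot>[^\/]+))(\S+\/)*"
    r"(?P<filename>[^\/\?;=\s]+)([^\s]*))))"
)


def parse_url(url):
    """Return a dict of the named groups for url, or None if nothing matches."""
    match = URL_PARTS.search(url)
    return match.groupdict() if match else None


if __name__ == "__main__":
    parts = parse_url("http://www.kaco.net/download/kacotv.exe")
    print(parts["filename"])     # kacotv.exe
    print(parts["contextRoot"])  # www.kaco.net
```

When the field holds a full URL including the scheme, the match starts at the second slash of "://", so contextRoot ends up holding the host name rather than the first path segment.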

Esteemed Legend

Like this:

... | rex field=url "^.*\/(?<programname>[^\.\/]+\.(?:[^\.\/]){3,4})$"
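
As a quick sanity check outside Splunk, the pattern can be run against the five sample URLs from the question in Python. This is a sketch under two assumptions: the group is written (?P<programname>...) because Python's re does not accept the (?<name>...) spelling, and the helper name extract_filename is made up for illustration.

```python
import re

# Same pattern as the rex above, with Python's named-group spelling.
FILENAME = re.compile(r"^.*\/(?P<programname>[^\.\/]+\.(?:[^\.\/]){3,4})$")

# Sample URLs taken from the original question.
SAMPLE_URLS = [
    "http://www.kaco.net/download/kacotv.exe",
    "http://acroipm2.adobe.com/15/rdr/ENU/win/nooem/none/consumer/message.zip",
    "https://prod308-client.redplum.com/protocol/install/P@H_prod308-1dF7CZ5x.exe",
    "http://download.microsoft.com/download/5/3/D/53D3880B-25F8-4714-A4AC-E463A492F96E/41212.00/Silverlight_x64.exe",
    "http://download.flv.com/kits/flvd/flvdownloader_setup.exe",
]


def extract_filename(url):
    """Return the last path segment if it ends in a 3-4 character extension, else None."""
    match = FILENAME.match(url)
    return match.group("programname") if match else None


if __name__ == "__main__":
    for url in SAMPLE_URLS:
        print(extract_filename(url))
```

The pattern keys on extension length only, so it also matches names such as page.html, and it skips names that contain more than one dot.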

Explorer

Hi woodcock,

Thank you for your speedy reply. I tried copying and pasting your solution into Splunk and it doesn't return any results.

...| rex field=url ".^.*\/(?<filename>[^\.\/]+\.(?:[^\.\/]){3,4})$" | top filename

Any ideas on what I could be missing?


Esteemed Legend

There was an extra period (".") at the start of the RegEx. I have fixed it; try again.
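
Why the stray period matters: "^" only matches at the very start of the string, and a "." in front of it has to consume one character first, so ".^" can never succeed and rex extracts nothing. The short Python check below illustrates this. It is a sketch: the group is spelled (?P<filename>...) for Python's re, the constant names are made up, and Splunk's PCRE engine is assumed to treat "^" the same way.

```python
import re

# Shared tail of the pattern; only the prefix differs between the two versions.
TAIL = r".*\/(?P<filename>[^\.\/]+\.(?:[^\.\/]){3,4})$"

BROKEN = re.compile(r".^" + TAIL)  # with the stray leading period
FIXED = re.compile(r"^" + TAIL)    # corrected version

URL = "http://www.kaco.net/download/kacotv.exe"

if __name__ == "__main__":
    print(BROKEN.search(URL))                    # None
    print(FIXED.search(URL).group("filename"))   # kacotv.exe
```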
