Hi, I have this rex that I'm trying to use to filter for any URL that points to a file with two or more extensions. So far I have this:
^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/|hxxp:\/\/|hxxps:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$
Any help is appreciated. Thanks!
hmm still not sure but i will give this a try
| makeresults
| eval url="hxxp://static.zipcloud.com/a/zipcloud//img/footer.break.png.exe"
| rex field=url ".*\/(?<ext>.*)"
| eval ext=split(ext,".")
| eval ext_count=mvcount(ext)
Here is what this does: the rex extracts everything after the last /, split turns that into a multivalue field, and mvcount counts the values.
In the example above this gives a count of 4: footer, break, png and exe.
A count of n means n-1 dots, so anything with a count greater than 2 has at least 2 dots, something like xx.yyy.zzz
That's the easy part. Matching against all extensions is a bit trickier: you can compare against some common extensions in the rex, or use the like function. But I will wait to hear from you first on whether this works for you.
For your use, assuming the field is named url, you just need to copy and reuse the code from the rex onwards:
| rex field=url ".*\/(?<ext>.*)"
| eval ext=split(ext,".")
| eval ext_count=mvcount(ext)
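For anyone who wants to sanity-check the logic outside Splunk, here is a minimal Python sketch of the same three steps (take the last path segment, split it on dots, count the pieces). Python is only a stand-in for the SPL above, and the function names are made up for illustration.

```python
def last_segment(url: str) -> str:
    """Return everything after the last '/' in the URL.

    Note: unlike the rex above, this returns the whole string
    when the URL contains no '/' at all.
    """
    return url.rsplit("/", 1)[-1]


def dot_parts(url: str) -> list[str]:
    """Split the last path segment on '.', similar to split(ext, ".")."""
    return last_segment(url).split(".")


url = "hxxp://static.zipcloud.com/a/zipcloud//img/footer.break.png.exe"
parts = dot_parts(url)
print(parts)       # ['footer', 'break', 'png', 'exe']
print(len(parts))  # 4, i.e. three dots in the file name
```

The count is always one more than the number of dots, which is why a plain name.ext file gives 2.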
This works great! So the split tells you how many sections are separated by dots. How do I only display ext_count of 3 or higher? How about 3 exactly?
Thanks!
| rex field=url ".*\/(?<ext>.*)"
| eval ext=split(ext,".")
| eval ext_count=mvcount(ext)
| search ext_count>=3
| dedup ext
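As a rough illustration of the "3 or higher" versus "exactly 3" question, here is the same filter sketched in Python. The sample URLs on example.test are invented for the demo. In the SPL above, swapping >= for = in the search line should give the exact case.

```python
def part_count(url: str) -> int:
    """Count dot-separated pieces in the last path segment."""
    return len(url.rsplit("/", 1)[-1].split("."))


# Hypothetical sample data, not taken from real proxy logs.
urls = [
    "hxxp://example.test/img/logo.png",             # 2 pieces
    "hxxp://example.test/img/footer.break.png",     # 3 pieces
    "hxxp://example.test/dl/footer.break.png.exe",  # 4 pieces
]

three_or_more = [u for u in urls if part_count(u) >= 3]
exactly_three = [u for u in urls if part_count(u) == 3]

print(three_or_more)
print(exactly_three)
```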
Got it! Thanks for all your help!
Glad to see you figured it out, @fdevera. Sorry, I am on IST and it was too late at night for me to see your comments.
If you are just aiming to get everything after the last slash, this is the regex:
^.*\/([^\/]+)$
https://regex101.com/r/y0D5rr/1
If you'd like to fine tune it to clarify extensions, you can do something like this:
^.*\/([^\/]+\.(png|pdf|docx|scr|exe))$
https://regex101.com/r/tv2Th5/1
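To show how the extension list behaves, here is a small Python sketch of that regex, plus a hypothetical variant that requires two listed extensions in a row (the .pdf.exe case). The variant, the case-insensitive flag, and the example.test URLs are assumptions added for illustration, not part of the regex above.

```python
import re

# Same extension list as the regex above; extend it as needed.
EXT = r"(?:png|pdf|docx|scr|exe)"

# File name ending in one listed extension (mirrors the regex above).
single = re.compile(rf"^.*/([^/]+\.{EXT})$", re.IGNORECASE)

# Hypothetical variant: file name ending in two listed extensions in a row.
double = re.compile(rf"^.*/([^/]+\.{EXT}\.{EXT})$", re.IGNORECASE)


def file_name(pattern: re.Pattern, url: str):
    """Return the captured file name, or None when the URL does not match."""
    m = pattern.match(url)
    return m.group(1) if m else None


print(file_name(single, "hxxp://example.test/a/report.pdf"))
print(file_name(double, "hxxp://example.test/a/report.pdf"))
print(file_name(double, "hxxp://example.test/a/invoice.pdf.exe"))
```

The trade-off is the usual one: a fixed list cuts false positives from stray dots in file names, but it misses any extension that is not on the list.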
I added this:
|rex url="^.*\/([^\/]+)$"
And received this error:
Error in 'rex' command: The regex 'url=^.*\/([^\/]+)$' does not extract anything. It should specify at least one named group. Format: (?<name>...).
Apologies, I was just trying to assist with the regex. If that's the error, here's what you need:
^.*\/(?<ThisIsWhatIWantMyFieldNamed>([^\/]+))$
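The difference between a plain group and a named group can be seen in a quick Python sketch. Note that Python spells a named group (?P<name>...) while rex uses (?<name>...), and the group name file_name here is just an example.

```python
import re

url = "hxxp://static.zipcloud.com/a/zipcloud//img/footer.break.png"

# Unnamed group: the regex matches, but there is no field name to extract into.
unnamed = re.match(r"^.*/([^/]+)$", url)
print(unnamed.groupdict())  # {}

# Named group: the captured text is available under the chosen name.
named = re.match(r"^.*/(?P<file_name>[^/]+)$", url)
fields = named.groupdict() if named else {}
print(fields)  # {'file_name': 'footer.break.png'}
```

This is the same idea behind the rex error: without a named group there is nothing to turn into a field.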
hi @fdevera
can you share a sample event and what all you want to extract?
index=webproxy |table url
example output:
hxxp://static.zipcloud.com/a/zipcloud//img/footer.break.png
I only want to display events whose url has more than one extension. I know this will be difficult because random periods in file names will cause a lot of false positives, but that's fine. Any ideas to reduce them would be great too.
hi @fdevera
I'm a bit confused about the 'extensions'. Is it 2 here because footer.break.png contains 2 dots? Or how do you count the extensions for this url?
I'm looking for direct links to files that have two extensions like .docx.scr or .pdf.exe. What would be the best way to do that in rex? I'm ok with false positives in the results.
Ah, so the example you gave above
hxxp://static.zipcloud.com/a/zipcloud//img/footer.break.png
qualifies as it has break.png, right?
What am I doing wrong here?
| rex field=url "^.*\/([^\/]+)$" | table urlrisk_gibson src_host src_ip dst_host dst_ip mwg_client_sent sent user_agent url field10 http_message http_method http_response http_version
As they said, we need to see your data and what you expect to see in order to help you.
Correct - no way around that since extensions can have more than 3 letters, sometimes 5 or 6. And filenames commonly have periods in them. At the very least I'd like to limit my results to those that have only two periods in the file name.
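One possible way to express "exactly two periods in the file name" is sketched below in Python. The pattern and the helper name are assumptions for illustration, and a similar character-class pattern should carry over to a rex or regex filter.

```python
import re

# Exactly three dot-free pieces separated by two dots, e.g. name.docx.scr
TWO_DOTS = re.compile(r"^[^.]+\.[^.]+\.[^.]+$")


def has_two_dots(url: str) -> bool:
    """True when the last path segment contains exactly two periods."""
    name = url.rsplit("/", 1)[-1]
    return bool(TWO_DOTS.match(name))


print(has_two_dots("hxxp://static.zipcloud.com/a/zipcloud//img/footer.break.png"))
print(has_two_dots("hxxp://static.zipcloud.com/a/zipcloud//img/footer.png"))
print(has_two_dots("hxxp://static.zipcloud.com"))
```

One caveat: a URL with no path, such as the last one, is also flagged because the host name itself has two periods, so it may be worth excluding URLs that have nothing after the host.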
Agreed. For questions like this, sample data is required: from the regex alone we can tell what your current regex is doing, but not whether it is doing what you want. Please share which events/values you want to include and which you want to exclude. Please scrub any sensitive data before posting samples.