topic Re: Extracting File Names from URL String in Splunk Search

Extracting File Names from URL String

TucoRameriz — Mon, 06 May 2013 18:15:49 GMT

Hello All,

Having some trouble coming up with a way to extract a file with three random characters and a .jnlp extension from the URI.

Here is what I've attempted so far. Any assistance would be greatly appreciated.

index=wsa .jnlp | rex field=csurl (?) | regex csurl="\/[a-z0-9]{3}.jnlp$"

Re: Extracting File Names from URL String

kristian_kolb — Mon, 06 May 2013 20:04:08 GMT

If you have the field csurl already defined, something like this should work.

index=wsa csurl=*.jnlp | rex field=csurl "(?<my_new_field>\w{3})\.jnlp$"

If the filename (excluding the extension) is shorter than 3 characters, the field extraction will fail. If it is longer than 3 characters, only the last 3 will be extracted into the new field.
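
To illustrate the two cases described above, here is a small sketch in Python rather than SPL, on the assumption that Python's re module treats this pattern the same way Splunk's rex does. The sample URLs and the extract helper are made up for illustration, and Python spells named groups (?P<name>...) where rex uses (?<name>...).

```python
import re

# Same pattern as the rex above; Python needs (?P<name>...) for named groups.
PATTERN = re.compile(r"(?P<my_new_field>\w{3})\.jnlp$")


def extract(url):
    """Return the captured value, or None when the pattern does not match."""
    match = PATTERN.search(url)
    return match.group("my_new_field") if match else None


# Hypothetical sample URLs, not taken from real logs.
samples = [
    "http://example.com/apps/abc.jnlp",       # exactly 3 characters
    "http://example.com/apps/ab.jnlp",        # shorter than 3: no match
    "http://example.com/apps/launcher.jnlp",  # longer than 3: last 3 captured
]

for url in samples:
    print(url, "->", extract(url))
```

The second URL yields no value because the character before "ab" is a slash, which \w does not match. The third yields "her", the last three characters before the extension.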

Re: Extracting File Names from URL String

TucoRameriz — Mon, 06 May 2013 20:51:54 GMT

Thanks for the reply. The one question I have is about the new field. Rex field extraction is not one of my strong points yet. Do I just give it a random name?

Thanks

Re: Extracting File Names from URL String

kristian_kolb — Tue, 07 May 2013 07:20:21 GMT

Well. Perhaps not random, but more or less arbitrary. Some hints, though:
- Use underscores instead of hyphens.
- Must not start with a number.
- Pick a name that makes sense.

Remember that you can always change a field extraction later, but if you do, you'll have to alter all tags, eventtypes, saved searches, etc. that use the (old) field name.

So if you have another log file that you want to correlate with, it could be a good idea to use the same field name here, e.g. a client IP address could/should always be extracted as clientip, regardless of the generating system.

Re: Extracting File Names from URL String

TucoRameriz — Mon, 28 Sep 2020 13:51:02 GMT

Gave it a try and this string returns all .jnlp files.

index=wsa cs_url=*.jnlp | rex field=cs_url "(?<file_extract>\w{3}).jnlp$"

Any thoughts?

Re: Extracting File Names from URL String

kristian_kolb — Tue, 07 May 2013 14:39:38 GMT

But of course, that's what you're searching for.

You could add a | search file_extract=* at the end, which requires that the field exists, regardless of its value. The field will not be set if the rex does not match.

Re: Extracting File Names from URL String

krugger — Tue, 07 May 2013 14:41:26 GMT

Wasn't that what you required? Please give an example of the input and the expected output.

Re: Extracting File Names from URL String

TucoRameriz — Tue, 07 May 2013 15:05:17 GMT

I was looking to extract only JNLP files with a three-character file name, e.g. 123.jnlp or abc.jnlp.
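
One possible way to get only three-character names is to anchor the pattern on the slash that precedes the file name, so a match cannot start in the middle of a longer name. The sketch below shows the idea in Python rather than SPL. It assumes the file name is the last path segment with no query string after it. The function names and sample URLs are hypothetical, and \w also accepts underscores.

```python
import re

# Assumption: the file name is the last path segment and nothing follows ".jnlp".
# The leading "/" keeps the match from starting inside a longer file name.
EXACT = re.compile(r"/(?P<file_extract>\w{3})\.jnlp$")


def three_char_name(url):
    """Return the three-character file name, or None if the URL does not qualify."""
    match = EXACT.search(url)
    return match.group("file_extract") if match else None


def keep_three_char(urls):
    """Keep only URLs whose .jnlp file name is exactly three characters long."""
    return [u for u in urls if three_char_name(u) is not None]


# Hypothetical sample URLs, not taken from real logs.
urls = [
    "http://example.com/a/123.jnlp",
    "http://example.com/a/abc.jnlp",
    "http://example.com/a/launcher.jnlp",
    "http://example.com/a/ab.jnlp",
]
print(keep_three_char(urls))
```

In SPL the same idea would presumably be a rex with a leading slash in front of the named group, followed by | search file_extract=* to drop events where the field was not set. That combination is untested here.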

Thanks