Splunk Search

How to remove \ (backslash) using from URLs rex sed?

Builder

I am trying to remove the escaped characters of "\" from the URLs coming in via a Twitter REST feed. Does anyone have the secret sauce for forming a rex field= mode=sed?

Sample URL: http:\/\/pbs.twimg.com\/media\/CoyGo5cUsAEmIZF.jpg

Thanks!

0 Karma
1 Solution

Contributor

This works for me in the search window:

| eval yourfieldname=replace(yourfieldname,"\\\\(.)","\1")

EDIT: a few words of explanation... the string "\\\\(.)" actually corresponds to the regex \\(.) which will match a single backslash followed by any character. The backslash has to be escaped once for the regex and another time to be in a double-quoted string, hence why one becomes four. If you're using the regex in a .conf file, depending how you do it, you don't need to escape it twice. Hope that helps.

NOTE: the advantage of that approach is that if your raw data has an escaped backslash (i.e. two backslashes in a row), it will do the right thing and replace it with one backslash rather than blindly removing all backslashes.

NOTE: this is probably also possible using sed.

View solution in original post

Motivator

Try this:

| gentimes start=-1 | eval url="http:\/\/pbs.twimg.com\/media\/CoyGo5cUsAEmIZF.jpg"  | rex mode=sed field=url "s/\\\//g"

You may also need to use the urldecode command for some urls (|eval url=urldecode(url)).

Builder

This worked as well! two ways to skin this one! thanks!

0 Karma

Contributor

This works for me in the search window:

| eval yourfieldname=replace(yourfieldname,"\\\\(.)","\1")

EDIT: a few words of explanation... the string "\\\\(.)" actually corresponds to the regex \\(.) which will match a single backslash followed by any character. The backslash has to be escaped once for the regex and another time to be in a double-quoted string, hence why one becomes four. If you're using the regex in a .conf file, depending how you do it, you don't need to escape it twice. Hope that helps.

NOTE: the advantage of that approach is that if your raw data has an escaped backslash (i.e. two backslashes in a row), it will do the right thing and replace it with one backslash rather than blindly removing all backslashes.

NOTE: this is probably also possible using sed.

View solution in original post

Builder

this worked, thanks!

0 Karma