Splunk Search

Strange character escaping in eval and rex

Path Finder

I am racking my brain over this.

Sometimes our users log in to our web application with the username "myuser" and sometimes with "mydomain\myuser". This screws up the results for "stats", because myuser and mydomain\myuser are counted as two different users. I need to remove the "mydomain\" string from the username. Here is what I am doing:

Search:

source="/var/log/iis" myserver.mydomain.com | eval username=lower(username) | eval username=replace(username,"mydomain\\\\","") | stats count by username | sort -count 

breaks with an error message, because Splunk seems to think that I am escaping the double quote instead of the \ character.

When I take "\" out of the statement:

source="/var/log/iis" myserver.mydomain.com | eval username=lower(username) | eval username=replace(username,"mydomain","") | stats count by username | sort -count 

it returns:

\\myuser
myuser

How can I get rid of the damn backslash???? I am surprised that Splunk seems to match the escape from the right side instead of from the left. The sequence "\\" should escape the \ character, not the double quote.

Same thing happens if I try to extract "myuser" from the username with rex:

rex field=_raw "^client\\\\(?<user>.*)"

It breaks, as if Splunk thinks that I am escaping the parenthesis.

Very strange and very frustrating.

It would be nice if the Splunk developers included a "chr(ascii-code)" command, so that any character in the search string could be replaced with its ASCII code in the places where this escaping nonsense happens.

Splunk Employee

Splunk regexes are PCRE, which does allow you to specify a character by codepoint. See: http://perldoc.perl.org/perlre.html#Regular-Expressions :

\x{}, \x00  character whose ordinal is the given hexadecimal number
\N{name}    named Unicode character or character sequence
\N{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
\o{}, \000  character whose ordinal is the given octal number

So this works:

| stats count | eval f="mydomain\myname" | eval g=replace(f,"^mydomain\\x5c","")

But in addition, this works perfectly for me:

| stats count | eval f="mydomain\myname" | eval g=replace(f,"^mydomain\\\\","")

returning g as myname, so I'm not sure why you have the problem. Note that in the Splunk search string, backslashes that you want to have as part of a regex must themselves be escaped with a backslash. The regexes that are actually applied in the above examples are therefore ^mydomain\x5c and ^mydomain\\ respectively.
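
To make the two layers of escaping concrete, here is a rough sketch in Python rather than SPL. It is only a toy model, not how Splunk is implemented: the helper `search_layer_unescape` is hypothetical and assumes the search-string layer does nothing except collapse each doubled backslash into one, and Python's `re` module stands in for PCRE.

```python
import re

def search_layer_unescape(typed: str) -> str:
    # Hypothetical helper (an assumption, not a Splunk API): models only the
    # rule that each pair of backslashes in the search string becomes one.
    return typed.replace("\\\\", "\\")

# The field value mydomain\myname (one literal backslash).
value = "mydomain\\myname"

# Raw strings hold exactly what would be typed between the quotes in the search.
for typed in (r"^mydomain\\\\", r"^mydomain\\x5c"):
    regex = search_layer_unescape(typed)  # what the regex engine receives
    print(typed, "->", regex, "->", re.sub(regex, "", value))
```

In this model both typed forms reach the regex engine as a pattern that matches one literal backslash, so both strip the prefix and leave myname.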

I wonder what version of Splunk you're on and whether there was a bug that has since been fixed. However, I'm also not sure that the search you provided in your question was correct, as I don't know if you typed extra backslashes in your search string to make it display right, or if you pasted it in unchanged. (I edited your question on the assumption that you had pasted the literal string without editing.)

Path Finder

Fantastic answer!!! Thanks a lot, gkanapathy!

Splunk Employee

Please, when you're posting character-specific items, use <code> or <pre>, or indent the line 4 spaces, and use the preview to make sure it displays right. Otherwise it's nearly impossible to tell what you actually entered, especially if your question involves escaped characters, like here.

Path Finder

I somewhat solved the issue by putting a "." character after mydomain and a "^" character at the beginning of the pattern. Here is the search:

source="/var/log/iis" myserver.mydomain.com | eval username=lower(username) | eval username=replace(username,"^mydomain.","") | stats count by username | sort -count 

Though it does work, it is not an elegant solution: "." matches any character, not just a backslash, so it would also mangle a username such as "mydomain1" if one exists in AD. Splunk developers, PLEASE address the issue of escaping a backslash in the search string. I BEG YOU...
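
For anyone comparing the two approaches, here is a small sketch of the difference, written in Python rather than SPL. It assumes Python's `re` module behaves like Splunk's PCRE for these simple patterns, and the account "mydomain1" is made up for illustration. The patterns are shown as the regex engine would receive them, that is, after any search-string escaping has already been resolved.

```python
import re

# "mydomain1" is a made-up account used only to show the over-matching.
usernames = ["mydomain\\myuser", "myuser", "mydomain1"]

loose = re.compile(r"^mydomain.")     # "." matches any single character
strict = re.compile(r"^mydomain\\")   # regex \\ matches exactly one backslash

loose_result = [loose.sub("", u) for u in usernames]
strict_result = [strict.sub("", u) for u in usernames]

print(loose_result)
print(strict_result)
```

The "." version empties the made-up "mydomain1" username entirely, while the escaped-backslash version leaves it alone and still strips the domain prefix.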