Splunk Search

How to properly handle backslashes in data

richardAtOmni
Path Finder

We use the HttpEventListener to input data into Splunk. Our data is pipe ('|') delimited and we have set up field extractors using the "delimited" method. This works perfectly so long as there are no backslashes or pipes in our data. If there are backslashes or pipes, we've observed some odd behavior in Splunk.

  1. If a field contains a double backslash as part of the data, as can happen when we log a Windows UNC path (e.g. "The network path is: \\serverA\folder1\subfolder"), the data looks correct in the "_raw" view, but in the extracted field Splunk collapses the double backslash into a single backslash. My example message above would look like this: "The network path is: \serverA\folder1\subfolder".

This happens only in the extracted field. In the _raw view all the backslashes are retained, so the two views disagree, which isn't ideal for our purposes. Is there a way to have both the raw view and the extracted field show the intended number of backslashes?

  2. If a particular field ends in a backslash, the pipe separator that follows it is escaped and included in the field instead of being treated as a separator, which throws off all the delimited field mappings.

  3. However, if a field has a pipe character in its actual data, we can use a backslash to escape it so that it is not treated as a delimiter. This is great. The problem is that in the extracted field, Splunk still shows the backslash character as though it were part of the data.

For example, our _raw could be abc\|def|12345, representing 2 fields with values "abc|def" and "12345". But if we look at the extracted value for field 1, we don't see "abc|def"; we see "abc\|def". The backslash is used as an escape, but it is not removed from the result. This is inconsistent with what happens when a backslash escapes another backslash.

This behavior has us scratching our heads as to how to properly handle special characters. Any advice would be greatly appreciated.
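For anyone trying to reproduce this, a pipe-delimited search-time extraction of the kind described above is typically expressed as a props.conf / transforms.conf pair along these lines. This is only a sketch: the stanza, transform, and field names are placeholders chosen for illustration, not our actual configuration.

```
# props.conf (sourcetype name is a placeholder)
[sourcetypeName]
REPORT-pipe_fields = pipe_delimited_fields

# transforms.conf (transform and field names are placeholders)
[pipe_delimited_fields]
DELIMS = "|"
FIELDS = field1, field2
```

With a setup like this, the raw event abc\|def|12345 is expected to map to field1 and field2, which is where the backslash handling described above shows up.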

akshatj2
Path Finder

Hi Richard,

Did you find a solution for this? I am facing a similar challenge with my data.


richardAtOmni
Path Finder

Unfortunately, we never did find a solution to this.


jkat54
SplunkTrust

When extracting the field you may choose to remove the backslash or not.

For your first example, it appears you've extracted everything AFTER the first slash such as this:

[sourcetypeName]
EXTRACT-uncPATH = \/(?<uncPATH>.+)

If you changed it to the following, it would extract both slashes:

[sourcetypeName]
EXTRACT-uncPATH = (?<uncPATH>\/\/.+)

You can always add it back in your search:

| makeresults count=1 | eval uncPath="/servername/share/" | rex mode=sed field=uncPath "s/^\//\/\//g"

Or remove it:

| makeresults count=1 | eval fieldName="abcd\|def" | rex mode=sed field=fieldName "s/\\//g"

Of course in your case you will not use eval or makeresults.
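If the field comes from a delimited extraction rather than a regex, the same kind of cleanup can be sketched at search time with eval's replace() function. The field name message and the sample value below are assumptions for illustration only. Backslash escaping in SPL string literals is easy to get wrong, so verify this against your own data before relying on it. The pattern uses four backslashes to match one literal backslash, and a character class ([|]) so the pipe cannot be read as regex alternation:

```
| makeresults count=1
| eval message="abc\\|def"
| eval message=replace(message, "\\\\[|]", "|")
```

In a real search you would drop the makeresults and the first eval and keep only the replace() on your extracted field. Note that this only strips the escape backslash in front of a pipe; it cannot restore a double backslash that was already collapsed during extraction.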

richardAtOmni
Path Finder

Thanks for your suggestion, but I don't think this quite addresses our scenario. I think your suggestion is to edit the regex used to extract the field to either include or exclude the slashes.

Unfortunately, in this case, we are not using the regex extraction here. We are using a delimited extraction, with pipes as the delimiter. The field is just a text string with a log message, and sometimes we will output a UNC path in the log message. We want the UNC path to show up correctly in the resulting "message" field. And we don't necessarily want to extract a separate field just for these scenarios.

So we are looking for suggestions to make the delimited extraction extract properly with respect to these slashes.
