Splunk Search

Fixing Splunk Apostrophe Errors: \x92s

mhtedford
Communicator

I have a data set of survey responses based on video conference call connection type.

One of the possible survey responses is " I used WebEx’s ‘Call Using Computer’ "

The apostrophes that surround Call Using Computer are messing up my data, as Splunk is reading these as separate events:
[screenshot]

Here you can see the stats count for each response:
[screenshot]

I'm not sure how to fix this issue: if I just summed them all together, the number would be incorrectly high.

I don't know if I should delete one of the responses or combine them or somehow create an average.

Please advise 🙂

jkat54
SplunkTrust

If CHARSET=CP1252 in props.conf doesn't work, try using SEDCMD in props.conf or rex in search.

[sourcetypeName]
CHARSET=CP1252

or

[sourcetypeName]
SEDCMD-x91=s/\\x91//g
SEDCMD-x92=s/\\x92//g
SEDCMD-xD5=s/\\xD5//g
SEDCMD-xD4=s/\\xD4//g
SEDCMD-underscore=s/_//g
# These rules just delete the sequences; change //g to /'/g to substitute a straight single quote instead.
# Note that the underscore rule removes every underscore in the event, not only the ones standing in for quotes.

SEARCH: ...| rex mode=sed "s/\\\\x9[1-2]//g" | rex mode=sed "s/\\\\xD[4-5]//g" | rex mode=sed "s/\_//g" | ....
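
To see what these substitutions do to the values before changing props.conf, here is a minimal Python sketch. It is not Splunk code, and the helper name normalize_quotes is made up for illustration. It assumes the events contain the literal text \x91, \x92, \xD4, \xD5 (a backslash, an x, and two hex digits) or an underscore where the curly quote used to be, and it swaps each one for a straight apostrophe rather than deleting it.

```python
import re

# Matches the literal escape text (backslash, "x", two hex digits) for the
# four quote codes mentioned above, or an underscore used as a placeholder.
QUOTE_LIKE = re.compile(r"\\x(?:91|92|D4|D5)|_")


def normalize_quotes(text: str) -> str:
    """Replace every quote-like sequence with a straight apostrophe."""
    return QUOTE_LIKE.sub("'", text)


# Raw strings keep the backslashes literal, matching how the events look.
variants = [
    r"I used WebEx\x92s \x91Call Using Computer\x92",
    r"I used WebEx\xD5s \xD4Call Using Computer\xD5",
    "I used WebEx_s _Call Using Computer_",
]

# All variants should collapse into a single distinct value.
normalized = {normalize_quotes(v) for v in variants}
print(normalized)
```

Once every variant collapses to the same string, a stats count by that field gives one row instead of three, so nothing has to be summed or averaged by hand.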

jkat54
SplunkTrust

You could also anticipate every possible hex value like this:

s/\\x[A-F0-9]{2}//g


ddrillic
Ultra Champion

"91 and 92 are the hex codes for open and close curly apostrophe (single quote) in the MS Windows default version of the latin1/ISO-8859-1..." (from "Vim shows strange characters <91>,<92>")

Some background on CP1252 (the Windows-1252 encoding): this code page, often loosely called Latin 1, is used by the Windows operating system to display a number of Latin-based languages.

CP1252 tends to cause issues here. True Latin-1 (ISO-8859-1) maps one-to-one onto the first 256 Unicode code points, so converting it to UTF-8's multi-byte representation is straightforward. CP1252, on the other hand, puts printable characters such as the curly quotes in the 0x80-0x9F range, and a lone 0x91 or 0x92 byte is not valid UTF-8.

Somewhere in the process, you need to convert this character to a utf-8 based character....
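
As a rough illustration of that conversion, the Python sketch below takes one of the survey answers as CP1252 bytes, shows that it cannot be read as UTF-8 directly, and then decodes it as CP1252 and re-encodes it as UTF-8. That the source file really is CP1252 is an assumption here; the byte string is built by hand from the answer text in the question.

```python
# The survey answer as it would look on disk if the file were CP1252:
# 0x91 and 0x92 are the opening and closing curly single quotes.
raw = b"I used WebEx\x92s \x91Call Using Computer\x92"

# Reading the bytes as UTF-8 fails, because 0x92 cannot start a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the right code page recovers the curly quotes...
text = raw.decode("cp1252")

# ...and the text can then be written back out as valid UTF-8.
utf8_bytes = text.encode("utf-8")

print(utf8_ok)
print(text)
```

Setting CHARSET in props.conf asks Splunk to do the equivalent decoding step when it reads the file.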


mattymo
Splunk Employee

I wonder if CHARSET= in props.conf would work... Splunk supports many encoding schemes.


ddrillic
Ultra Champion

Right, but here it's not a character, it's a sequence which holds the hex code value of the character -

![alt text][1]


mattymo
Splunk Employee

Is it possible we changed it to the hex on ingest because it's set to UTF-8?

Hard to know without seeing the raw data, I would assume...

https://docs.splunk.com/Documentation/Splunk/latest/Data/Configurecharactersetencoding


mattymo
Splunk Employee

This is very much in line with your other question we worked on. You need to work on the props.conf and account for these items, just like the date extractions in your other post.

props.conf has a setting called

FIELD_QUOTE = <character>
* Tells Splunk the character to use for quotes in the specified file or
  source.
* This attribute supports the use of the special characters described above.

but I'm not sure how it will act with a quote mid-field... I will try.

https://docs.splunk.com/Documentation/Splunk/6.6.2/Admin/Propsconf

From the screenshot above, it doesn't seem to be breaking your events incorrectly, but it does appear to be dealing with the characters in a few different ways. Ideally those fields would be encapsulated in quotes to protect the string. Does the raw data have double quotes surrounding the string? Do you have the ability to configure those strings?

Again if you can post a couple of the troubled raw events we can help get you sorted.

mhtedford
Communicator

@mmodestino

There are three different 'Call Using Computer' events that are being generated:

  • I used WebEx\x92s \x91Call Using Computer\x92
  • I used WebEx\xD5s \xD4Call Using Computer\xD5
  • I used WebEx_s _Call Using Computer_

Here is a picture of the first: http://imgur.com/a/r7Cfw
Here is some of the raw data:

Very Dissatisfied,1,-,-,-,-,,,,,,,,,,,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,Yes,Venezuela,6/22/2017 2:25:51 PM,6/22/2017 2:25:51 PM
Dissatisfied,2,I had difficulty joining the audio for the meeting.,I had audio quality problems during my meeting.,-,-,-,-,Using \x91Call Using Computer\x92,-,-,Delays,-,,,,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,Yes,United States,6/22/2017 10:09:11 AM,6/22/2017 10:09:11 AM
Dissatisfied,2,-,I had audio quality problems during my meeting.,-,-,,,,-,Noise or static,-,-,,,,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,Yes,Portugal,6/22/2017 8:52:54 AM,6/22/2017 8:52:54 AM
Very Dissatisfied,1,I had difficulty joining the audio for the meeting.,I had audio quality problems during my meeting.,-,-,-,-,Using \x91Call Using Computer\x92,-,-,-,Call drops/disconnects,,,,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,No,Turkey,6/22/2017 8:05:39 AM,6/22/2017 8:05:39 AM
Very Dissatisfied,1,-,-,I had difficulty connecting to the WebEx application.,-,,,,,,,,Application never loaded,-,-,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,Yes,France,6/22/2017 7:18:25 AM,6/22/2017 7:18:25 AM
Dissatisfied,2,-,I had audio quality problems during my meeting.,-,-,,,,-,Noise or static,Delays,-,,,,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,Yes,Poland,6/22/2017 6:18:04 AM,6/22/2017 6:18:04 AM
Very Dissatisfied,1,I had difficulty joining the audio for the meeting.,-,I had difficulty connecting to the WebEx application.,-,-,-,-,,,,,Application never loaded,-,-,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,No,United States,6/21/2017 3:05:23 PM,6/21/2017 3:05:23 PM
Dissatisfied,2,-,-,I had difficulty connecting to the WebEx application.,-,,,,,,,,Application never loaded,-,-,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,Yes,Costa Rica,6/21/2017 12:25:35 PM,6/21/2017 12:25:35 PM
Dissatisfied,2,-,-,I had difficulty connecting to the WebEx application.,-,,,,,,,,Application never loaded,Application took more than 60 seconds to load,-,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,No,India,6/21/2017 11:30:37 AM,6/21/2017 11:30:37 AM
Very Dissatisfied,1,-,-,I had difficulty connecting to the WebEx application.,-,,,,,,,,-,Application took more than 60 seconds to load,-,,,,,,I used WebEx\x92s \x91Call Using Computer\x92,No,Germany,6/21/2017 9:10:18 AM,6/21/2017 9:10:18 AM

Here is a picture of the second: http://imgur.com/a/acmWG
Here is some of the raw data:

Very Dissatisfied,1,-,I had audio quality problems during my meeting. ,-,-,,,,-,-,Delays,-,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,India,4/28/2017 0:22,4/28/2017 0:22
Very Dissatisfied,1,I had difficulty joining the audio for the meeting.,I had audio quality problems during my meeting.,-,-,-,-,Using \xD4Call Using Computer\xD5,Echo,Noise or static,Delays,-,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,India,5/3/2017 12:32:25 AM,5/3/2017 12:32:25 AM
Very Dissatisfied,1,I had difficulty joining the audio for the meeting.,-,I had difficulty connecting to the WebEx application.,-,-,-,-,,,,,-,-,-,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,No,India,5/7/2017 11:34:02 PM,5/7/2017 11:34:02 PM
Dissatisfied,2,-,I had audio quality problems during my meeting.,-,-,,,,-,Noise or static,-,-,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,Costa Rica,5/5/2017 3:14:10 PM,5/5/2017 3:14:10 PM
Very Dissatisfied,1,-,-,I had difficulty connecting to the WebEx application.,-,,,,,,,,Application never loaded,-,-,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,United States,5/5/2017 2:26:35 PM,5/5/2017 2:26:35 PM
Dissatisfied,2,-,I had audio quality problems during my meeting.,-,-,,,,Echo,Noise or static,-,-,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,United States,5/5/2017 11:27:02 AM,5/5/2017 11:27:02 AM
Very Dissatisfied,1,-,I had audio quality problems during my meeting.,-,-,,,,-,Noise or static,-,Call drops/disconnects,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,France,5/5/2017 5:34:04 AM,5/5/2017 5:34:04 AM
Dissatisfied,2,I had difficulty joining the audio for the meeting.,-,I had difficulty connecting to the WebEx application.,-,Using \xD4Call Me\xD5,-,Using \xD4Call Using Computer\xD5,,,,,-,Application took more than 60 seconds to load,-,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,Yes,Indonesia,5/4/2017 11:33:23 PM,5/4/2017 11:33:23 PM
Dissatisfied,2,-,I had audio quality problems during my meeting.,-,-,,,,-,-,-,Call drops/disconnects,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,No,United States,5/4/2017 10:03:02 PM,5/4/2017 10:03:02 PM
Dissatisfied,2,I had difficulty joining the audio for the meeting.,-,-,-,-,-,Using \xD4Call Using Computer\xD5,,,,,,,,,,,,,I used WebEx\xD5s \xD4Call Using Computer\xD5,No,Per\x9C,5/4/2017 1:02:36 PM,5/4/2017 1:02:36 PM

Here is a picture of the third: http://imgur.com/a/vsI7H
Here is the raw data:

Dissatisfied,2,-,I had audio quality problems during my meeting. ,-,-,,,,Echo,Noise or static,-,-,,,,,,,,,I used WebEx_s _Call Using Computer_,Yes,Germany,4/5/2017 7:33,4/5/2017 7:33
Very Dissatisfied,1,-,-,I had difficulty connecting to the WebEx application.,-,,,,,,,,Application never loaded,-,-,,,,,,I used WebEx_s _Call Using Computer_,No,Italy,4/5/2017 7:25,4/5/2017 7:25
Dissatisfied,2,I had difficulty joining the audio for the meeting.,I had audio quality problems during my meeting. ,I had difficulty connecting to the WebEx application.,-,-,-,Using _Call Using Computer_,Echo,Noise or static,-,-,-,Application took more than 60 seconds to load,-,,,,,,I used WebEx_s _Call Using Computer_,Yes,China,4/5/2017 3:59,4/5/2017 3:59
Dissatisfied,2,-,I had audio quality problems during my meeting. ,-,-,,,,Echo,Noise or static,Delays,-,,,,,,,,,I used WebEx_s _Call Using Computer_,Yes,Japan,4/5/2017 3:45,4/5/2017 3:45
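
One observation on the second batch, offered as a guess rather than something confirmed in this thread: the bytes 0xD4, 0xD5 and 0x9C are where the Mac OS Roman encoding puts the opening curly quote, the closing curly quote and "ú". The Python sketch below decodes a sample from each of the first two batches with its assumed encoding to show that they come out as the same text. The byte strings are rebuilt by hand from the events above, and the per-batch encodings are assumptions.

```python
# One answer from each batch, paired with the encoding assumed for it.
samples = [
    (b"I used WebEx\x92s \x91Call Using Computer\x92", "cp1252"),
    (b"I used WebEx\xd5s \xd4Call Using Computer\xd5", "mac_roman"),
]

# Decode each sample with its own (assumed) encoding.
decoded = [raw.decode(encoding) for raw, encoding in samples]

# The country value from the last event of the second batch.
country = b"Per\x9c".decode("mac_roman")

print(decoded)
print(country)
```

If that guess holds, the April/May files and the June files were exported with different encodings, and each would need its own CHARSET (or its own cleanup rule) to end up as a single value.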

mattymo
Splunk Employee

Is this the raw data directly from the WebEx file?


woodcock
Esteemed Legend

The right way to fix this is to fix your line breaking as the events are being indexed (not after the fact at search time). Are you looking for a quick search-time workaround or a proper index-time fix? If the latter, we need to see the props.conf for this input from your indexers.

mhtedford
Communicator

@woodcock thanks for the response.

I would like to pursue the proper index-time fix.

Where do I find the props.conf?


woodcock
Esteemed Legend

Go to CLI on one of your Indexers and type this:

splunk btool props list --debug

Then look for your sourcetype and post everything that comes after it
