Splunk Search

Trouble indexing special characters in UTF-8

nc_lks
Engager

Hi Splunk community!

I'm trying to index a CSV file in which several values contain special characters such as æøå and | (vertical bar).
The problem is that these characters are indexed as escape sequences such as '\xF8' and '\xE6', and some strings have a '?' inserted as the first and/or last character.
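
For reference, '\xF8' and '\xE6' are what ø and æ look like as single bytes in Latin-1 (ISO-8859-1), whereas UTF-8 encodes each of them as two bytes. The minimal Python sketch below illustrates the difference. The helper name is_valid_utf8 is made up for this example, and the result only suggests (it does not prove) that some of the data reaching the indexer is not actually UTF-8.

```python
# Sketch: compare how "æøå" is encoded in UTF-8 versus Latin-1.
text = "æøå"

utf8_bytes = text.encode("utf-8")      # two bytes per character
latin1_bytes = text.encode("latin-1")  # one byte per character

print(utf8_bytes)    # b'\xc3\xa6\xc3\xb8\xc3\xa5'
print(latin1_bytes)  # b'\xe6\xf8\xe5'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Latin-1 bytes for these characters are not valid UTF-8, so a reader
# that expects UTF-8 has to escape or replace them.
print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(latin1_bytes))
```

If the escapes seen in the index match the Latin-1 column, it may be worth checking whether every file in the input really is UTF-8.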

When I open the file using Notepad++ and/or Sublime Text, the special characters appear correctly.
Notepad++ reports the encoding as UTF-8-BOM.
I also checked the encoding on a *nix machine using the file command, which reported:
Filename.csv: UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators.
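
The same check can be scripted when many files are involved. The sketch below looks at the first bytes of a file for a byte order mark, using the BOM constants from Python's codecs module. The function names detect_bom and detect_file_bom are made up for this example. A BOM only says what the file claims to be; it does not guarantee that every later byte is valid in that encoding.

```python
import codecs

# Known byte order marks, UTF-32 first so that its little-endian BOM
# (FF FE 00 00) is not mistaken for the UTF-16 one (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None


def detect_file_bom(path: str):
    """Read the first four bytes of a file and check them for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

For example, detect_file_bom("Filename.csv") would be expected to return "utf-8-sig" for the file described above.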

I have tried configuring props.conf for the input with both:
- CHARSET=AUTO
- CHARSET=UTF-8
But neither of these seems to solve my issue...

I have also tried exporting the CSV file as Unicode and indexing it with the charset set to AUTO and to UCS-2LE, which resulted in many lines being interpreted as Chinese characters.
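
One possible explanation for the Chinese characters, not confirmed anywhere in this thread, is that bytes were decoded as 16-bit units when they were either not UTF-16 at all or were shifted by one byte. The Python sketch below shows both effects on a made-up ASCII sample; is_cjk is a hypothetical helper that checks for the main CJK ideograph block.

```python
def is_cjk(ch: str) -> bool:
    """True if the character is in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


# Case 1: plain single-byte text read as UTF-16LE.
# Each pair of ASCII bytes is fused into one 16-bit code unit.
wrong_charset = "abcd".encode("utf-8").decode("utf-16-le")

# Case 2: genuine UTF-16LE text read one byte off, for example if the
# data were split in the middle of a two-byte character.
utf16 = "abc".encode("utf-16-le")          # 61 00 62 00 63 00
misaligned = utf16[1:-1].decode("utf-16-le")  # 00 62 00 63

print([hex(ord(c)) for c in wrong_charset])
print([hex(ord(c)) for c in misaligned])
print(all(is_cjk(c) for c in wrong_charset + misaligned))
```

In both cases ordinary Latin text ends up in the CJK range, which resembles the symptom described above.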

 

Has anyone experienced and solved something similar?


Vardhan
Contributor

Hi @nc_lks ,

To resolve this issue, first ingest the data in Splunk through the Add Data option, then go to the advanced settings, select the charset, and try the available encodings one by one until the characters display correctly. One of them should work.
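
The same trial-and-error idea can be sketched outside Splunk in Python: decode a sample of the raw bytes with several candidate encodings and look at which results read correctly. The function name try_encodings and the candidate list are assumptions made for this example, not Splunk settings. Note that latin-1 accepts any byte sequence, so a successful decode still has to be checked by eye.

```python
def try_encodings(data: bytes, candidates):
    """Decode data with each candidate; map encoding -> text, or None on failure."""
    results = {}
    for enc in candidates:
        try:
            results[enc] = data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            results[enc] = None
    return results


# Hypothetical candidate list: strict encodings first, permissive ones last.
CANDIDATES = ["utf-8-sig", "cp1252", "latin-1"]

# Made-up sample: a CSV-like value stored as Latin-1 bytes.
sample = "æøå|x".encode("latin-1")

for enc, text in try_encodings(sample, CANDIDATES).items():
    print(enc, "->", repr(text))
```

For a real file, the sample could be read with open(path, "rb").read(4096) instead of the made-up bytes.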

nc_lks
Engager

Hi @Vardhan,

Thank you very much for the answer - I hadn't actually thought of that...

The flaw turned out to be an encoding error in one specific file at some point in the pipeline.
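
For anyone hunting a similar problem, a small script can help locate the offending file and line. The sketch below is not the method used in this thread; it simply reports every line of a byte string that fails to decode as UTF-8, along with the position and value of the first bad byte. The function name find_invalid_utf8_lines is made up for this example.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line number, byte offset in line, offending byte) for bad lines."""
    bad = []
    # bytes.splitlines() handles LF, CR and CRLF line terminators.
    for lineno, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            bad.append((lineno, exc.start, line[exc.start:exc.start + 1]))
    return bad


# Made-up sample: line 2 holds ø as UTF-8, line 3 holds it as Latin-1.
sample = b"id,name\r\n1,\xc3\xb8\r\n2,\xf8\r\n"
print(find_invalid_utf8_lines(sample))
```

Running this over each file in the pipeline, with the contents read in binary mode, should narrow down where non-UTF-8 bytes are introduced.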
