Splunk Search

Trouble indexing special characters in UTF-8

nc_lks
Engager

Hi Splunk community!

I'm trying to index a CSV file where multiple values contains special characters such as æøå and | (vertical bar).
The problem resides in characters such as these being indexed as '\xF8', '\xE6' and the like, as well as some strings having '?' inserted as the first and/or last character.

When I open the file using Notepad++ and/or Sublime Text, the special characters appear correctly.
Also, in Notepad++ it writes the encoding as: UTF-8-BOM.
I also tried checking the encoding with a *nix machine using the file command to which I received the result:
Filename.csv: UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators.

I have tried configuring my props.conf  for the input with both:
- CHARSET=AUTO
- CHARTSET=UTF-8
But none of these seems to solve my issue...

I have also tried exporting my CSV file as Unicode where I tried indexing with charset set to AUTO and UCS-2LE, which resulted in manyof lines being interpreted as chinese symbols.

 

Might someone have experienced and solved something similar?

Labels (1)
0 Karma
1 Solution

Vardhan
Contributor

Hi @nc_lks ,

 To resolve this issue first take the data and  ingest in splunk through Add-Data option then go to advanced settings and select charset and try all encoding languages one will definitely work.

View solution in original post

0 Karma

Vardhan
Contributor

Hi @nc_lks ,

 To resolve this issue first take the data and  ingest in splunk through Add-Data option then go to advanced settings and select charset and try all encoding languages one will definitely work.

0 Karma

nc_lks
Engager

Hi @Vardhan,

Thank you very much for the answer - I hadn't actually thought of that...

I found the flaw as an encoding error for a specific file at some point in the pipeline.

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...