Splunk Search

Trouble indexing special characters in UTF-8

nc_lks
Engager

Hi Splunk community!

I'm trying to index a CSV file in which multiple values contain special characters such as æøå and | (vertical bar).
The problem is that these characters get indexed as '\xF8', '\xE6' and the like, and some strings have a '?' inserted as the first and/or last character.

When I open the file using Notepad++ and/or Sublime Text, the special characters appear correctly.
Notepad++ also reports the encoding as UTF-8-BOM.
I also checked the encoding on a *nix machine with the file command, which returned:
Filename.csv: UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators.
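
The BOM reported above can also be checked outside Splunk. Below is a minimal Python sketch; the sample bytes and the function names are made up for illustration. It shows that a plain UTF-8 decode keeps the BOM as an invisible leading character (U+FEFF), while Python's `utf-8-sig` codec drops it. An unstripped BOM might be one source of a stray leading character, though that is only an assumption here.

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return raw.startswith(codecs.BOM_UTF8)


def decode_utf8_dropping_bom(raw: bytes) -> str:
    """Decode as UTF-8; the 'utf-8-sig' codec removes a leading BOM if present."""
    return raw.decode("utf-8-sig")


# Hypothetical first line of a CSV that was saved as "UTF-8 with BOM".
sample = codecs.BOM_UTF8 + "navn|værdi\r\n".encode("utf-8")

print(has_utf8_bom(sample))                        # is a BOM present?
print(repr(sample.decode("utf-8")[0]))             # plain UTF-8 keeps the BOM
print(repr(decode_utf8_dropping_bom(sample)[0]))   # utf-8-sig drops it
```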

I have tried configuring my props.conf for the input with both:
- CHARSET=AUTO
- CHARSET=UTF-8
But neither of these seems to solve my issue...

I have also tried exporting my CSV file as Unicode and indexing it with the charset set to AUTO and to UCS-2LE, which resulted in many lines being interpreted as Chinese symbols.
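
One way such "Chinese" output can arise is a mismatch between the file's real encoding and the 16-bit little-endian charset used to read it. The Python sketch below only illustrates that mechanism; the helper names are hypothetical, and it is not a claim about what happened in this particular file. When the bytes are read as UTF-16LE, each pair of bytes becomes one code point, and pairs of ASCII bytes (or big-endian ASCII) tend to land in the CJK block.

```python
def misread_as_utf16le(text: str, actual_encoding: str) -> str:
    """Encode text with its real encoding, then wrongly decode it as UTF-16LE."""
    return text.encode(actual_encoding).decode("utf-16-le")


def is_cjk(ch: str) -> bool:
    """True if the character lies in the CJK Unified Ideographs block."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


# UTF-8 bytes read as UTF-16LE: two ASCII bytes fuse into one code point.
from_utf8 = misread_as_utf16le("name,age", "utf-8")

# Big-endian UTF-16 read as little-endian: each character's bytes are swapped.
from_utf16be = misread_as_utf16le("name", "utf-16-be")

print(from_utf8, all(is_cjk(ch) for ch in from_utf8))
print(from_utf16be, all(is_cjk(ch) for ch in from_utf16be))
```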

 

Has anyone experienced and solved something similar?


Vardhan
Contributor

Hi @nc_lks ,

To resolve this issue, first ingest the data into Splunk through the Add Data option, then go to the advanced settings, select the charset, and try each of the available encodings; one of them will definitely work.
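
The same trial-and-error idea can be sketched offline in Python before touching Splunk. This is only an illustration: the candidate list, the function name, and the sample text are assumptions, and Python codec names are not necessarily the same strings Splunk accepts for CHARSET. The sketch keeps an encoding only if the bytes decode without error and a string you expect to see (for example æøå) appears in the result.

```python
# Hypothetical shortlist of encodings to try; extend as needed.
CANDIDATES = ("utf-8-sig", "cp1252", "latin-1", "utf-16-le")


def charsets_that_fit(raw: bytes, expected: str, candidates=CANDIDATES) -> list:
    """Return the candidate encodings under which raw decodes cleanly
    and the decoded text contains the expected substring."""
    hits = []
    for name in candidates:
        try:
            decoded = raw.decode(name)
        except UnicodeError:
            continue  # this encoding cannot represent the bytes at all
        if expected in decoded:
            hits.append(name)
    return hits


# Made-up sample: text that was actually written as Windows-1252.
sample = "blåbærgrød|æøå".encode("cp1252")
print(charsets_that_fit(sample, "æøå"))
```

Several encodings can match at once (cp1252 and latin-1 agree on these characters), so the result narrows the choice rather than proving a single answer.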

nc_lks
Engager

Hi @Vardhan,

Thank you very much for the answer - I hadn't actually thought of that...

I traced the flaw to an encoding error in a specific file at some point in the pipeline.
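
For anyone hunting a similar problem, a rough Python sketch of how a badly encoded file could be spotted is shown below. The function names are hypothetical and this is not the exact check used in this thread. It decodes each file strictly as UTF-8 and reports the offset and value of the first byte that does not fit, such as the 0xF8 that a Latin-1 'ø' leaves behind.

```python
from pathlib import Path


def first_invalid_utf8(raw: bytes):
    """Return (offset, byte value) of the first byte that is not valid UTF-8,
    or None if the whole byte string decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, raw[err.start]
    return None


def scan_files(paths):
    """Map each path to the result of first_invalid_utf8 for its contents."""
    return {str(p): first_invalid_utf8(Path(p).read_bytes()) for p in paths}


# Made-up samples: one value written as UTF-8, one written as Latin-1.
good = "rød".encode("utf-8")
bad = "rød".encode("latin-1")

print(first_invalid_utf8(good))
print(first_invalid_utf8(bad))
```

Running `scan_files` over the files produced at each stage of a pipeline would point to the first stage whose output stops being valid UTF-8.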
