Charset Encoding

kenchisho
Path Finder

Hi guys,

I have installed Splunk 4.3 on Mac OS X 10.7.

I am trying to index data with a non-UTF encoding. I have tried pretty much every encoding available in Splunk without any luck: the non-ASCII characters get replaced with other symbols.

Example

In my log files I have "DAVOR ĆORIĆ", and it gets indexed as "DAVOR žORIž" or with some other symbol, depending on which charset I use with this sourcetype. I never get the correct data indexed.

Has anyone had a similar problem, and possibly a simple solution?

jbsplunk
Splunk Employee

Here is a list of supported character sets, and instructions on how to apply them to data:

http://docs.splunk.com/Documentation/Splunk/latest/data/Configurecharactersetencoding
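
As a rough illustration of what those instructions amount to: the character set is set per sourcetype in props.conf with the CHARSET attribute. This is only a sketch. The stanza name is a hypothetical placeholder for your own sourcetype, and CP1250 is simply the value mentioned elsewhere in this thread, not a confirmed fix for this data.

```
# props.conf (sketch; "my_sourcetype" is a hypothetical sourcetype name)
[my_sourcetype]
CHARSET = CP1250
```

The setting applies at index time, so data that was already indexed with the wrong charset would need to be re-indexed to pick up the change.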

jbsplunk
Splunk Employee

If you open the file with a tool like TextWrangler, what does it detect as the charset? I've found that to be pretty reliable in troubleshooting these kinds of issues.
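
If no such editor is at hand, the same kind of check can be approximated with a short script. The sketch below is not a Splunk feature. It assumes you know a string that should appear in the file (here the name from the original post) and simply tries a handful of candidate encodings to see which ones decode the raw bytes to that string. The function name and the candidate list are illustrative assumptions.

```python
# Candidate encodings to try; this list is an assumption, extend it as needed.
CANDIDATES = ["utf_8", "utf_16", "cp1250", "iso8859_2", "mac_latin2", "cp852"]


def matching_encodings(raw: bytes, expected: str, candidates=CANDIDATES):
    """Return the candidates under which `raw` decodes to text containing `expected`."""
    matches = []
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # the bytes are not valid in this encoding
        if expected in text:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Simulated sample. In practice, read a chunk of the real log in binary
    # mode instead, e.g. open(path, "rb").read(4096).
    sample = "DAVOR ĆORIĆ".encode("cp1250")
    print(matching_encodings(sample, "ĆORIĆ"))
```

More than one encoding can match, since several Central European code pages place some letters at the same byte values. If nothing matches, the file may use an encoding outside the list, or the bytes may already have been damaged before they reached Splunk.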

kenchisho
Path Finder

Hi jbsplunk,

thanks for the quick reply.

I have tried setting the charset manually, but Splunk still garbles the data when indexing. I have tried pretty much all the charsets available in Splunk. Usually with this type of data I use CP1250 and all goes well, but with this data set no charset configuration works.

I have also tried this with a Linux install of Splunk, thinking it might be an OS X related issue, and I get the same results.

I am guessing this might be a bug, but I am not quite sure yet.
