Charset Encoding

Path Finder

Hi guys,

I have installed Splunk 4.3 on Mac OS X 10.7.

I am trying to index data in a non-UTF-8 encoding. I have tried pretty much every encoding available in Splunk without any luck: the non-ASCII characters get replaced with other symbols.

Example

In my log files I have "DAVOR ĆORIĆ", and it gets indexed as "DAVOR žORIž" or with some other symbol, depending on which charset I use with this sourcetype. I never get the correct data indexed.

Has anyone had a similar problem, and possibly a simple solution?

Splunk Employee

Here is a list of supported character sets, and instructions on how to apply them to data:

http://docs.splunk.com/Documentation/Splunk/latest/data/Configurecharactersetencoding
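For reference, a charset override in props.conf generally looks something like the sketch below. The stanza name my_croatian_logs is made up for illustration, and CP1250 is only an example value; substitute your own sourcetype and whatever encoding the file is actually in. As far as I know, the setting only affects data indexed after the change, so already-indexed events stay garbled.

```ini
# props.conf (sketch; the stanza name is hypothetical)
[my_croatian_logs]
# Tell Splunk which encoding the raw file uses so it can convert to UTF-8
CHARSET = CP1250
```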

Splunk Employee

If you open the file with a tool like TextWrangler, what does it detect as the charset? I've found that to be pretty reliable in troubleshooting these kinds of issues.
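If an editor's detection is inconclusive, one rough way to cross-check is to take the raw bytes of a line whose correct text you already know and see which encodings decode it to that text. The Python sketch below does this. The candidate list and the function name matching_encodings are my own assumptions, not anything Splunk provides, and the sample bytes are generated here for demonstration rather than taken from the poster's file.

```python
# Encodings often seen with Central European text (an assumed list; extend as needed).
CANDIDATES = ["utf-8", "cp1250", "iso8859_2", "cp852", "mac_latin2"]


def matching_encodings(raw, expected, candidates=CANDIDATES):
    """Return the candidate encodings under which `raw` decodes to text containing `expected`."""
    matches = []
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding at all, so skip it.
            continue
        if expected in text:
            matches.append(name)
    return matches


# Demonstration only: fabricate bytes in a known encoding, then "rediscover" it.
# With a real log, read the bytes instead: raw = open(path, "rb").read()
sample = "DAVOR ĆORIĆ".encode("cp852")
print(matching_encodings(sample, "DAVOR ĆORIĆ"))
```

If no candidate matches on the real file, the data may be in an encoding that is not on the list, or it may already have been converted incorrectly before it reached the log.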

Path Finder

Hi jbsplunk,

Thanks for the quick reply.

I have tried setting the charset manually, but Splunk still garbles the data when indexing. I have tried pretty much all the charsets available in Splunk. Usually with this type of data I use CP1250 and all goes well, but with this set of data it is a no-go with any charset config.

I have tried this with a Linux install of Splunk, thinking it might be an OS X related issue, and I get the same results.

I am guessing this might be a bug, but I am not quite sure yet.
