Hi guys,
I have installed Splunk 4.3 on Mac OS X 10.7.
I am trying to index data with a non-UTF encoding. I have tried pretty much every encoding available in Splunk without any luck: the non-ASCII characters get replaced with other symbols.
Example:
In my log files I have "DAVOR ĆORIĆ", and it gets indexed as "DAVOR žORIž" or with some other symbol, depending on which charset I use with this sourcetype. I never get the correct data indexed.
Has anyone had a similar problem, and possibly a simple solution?
Here is a list of supported character sets, and instructions on how to apply them to data:
http://docs.splunk.com/Documentation/Splunk/latest/data/Configurecharactersetencoding
If you open the file with a tool like TextWrangler, what does it detect as the charset? I've found that to be pretty reliable in troubleshooting these kinds of issues.
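If an editor is not handy, a rough way to do the same check is to decode the raw bytes yourself and see which charsets reproduce a string you know should be there. This is only a sketch in Python, not anything Splunk-specific: the helper name `try_decodings`, the candidate list, and the sample bytes are all assumptions for illustration, and the codec names are Python's, which may not match Splunk's charset names exactly.

```python
# Candidate charsets to test. This list is an assumption; extend it as needed.
CANDIDATES = ["utf-8", "cp1250", "iso8859_2", "mac_latin2", "cp852"]


def try_decodings(raw, expected, candidates=CANDIDATES):
    """Return the candidate encodings under which `raw` decodes
    to text that contains the string `expected`."""
    matches = []
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding at all.
            continue
        if expected in text:
            matches.append(name)
    return matches


# Stand-in for bytes read from the real log, e.g. open(path, "rb").read(),
# where `path` would be your own file (hypothetical here).
sample = "DAVOR ĆORIĆ".encode("cp1250")
print(try_decodings(sample, "ĆORIĆ"))
```

More than one charset can match, since several Central European encodings share code points for some letters. If nothing in the list matches your real file, the data is probably in a charset not listed, or it was already damaged before it reached the log.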
Hi jbsplunk,
thanks for the quick reply.
I have tried setting the charset manually, but Splunk still garbles the data when indexing. I have tried pretty much all the charsets available in Splunk. Usually with this type of data I use CP1250 and all goes well, but with this set of data it is a no-go with any charset config.
I have tried this with a Linux install of Splunk, thinking it might be an OS X related issue, and I get the same results.
I am guessing this might be a bug, but I am not quite sure yet.