Hello everybody.
I'm having trouble managing logs that contain Spanish strings with accented characters (ó, á), such as Gestión, Emisión, módulo, etc.
When indexed, these strings become "Gesti\xF3n", "Emisi\xF3n", "M\xF3dulo".
What should I do so that the log information appears exactly as it was written before indexing?
Many thanks in advance, and have a great New Year's Eve.
Is this the problem you are seeing?
What is the log file's character encoding? Perhaps ISO-8859-1? If it is NOT UTF-8, then you will need to configure Splunk with the proper CHARSET= in props.conf. Assuming the file is actually using ISO-8859-1 (a common encoding for Western European languages), then something like this should work:
[source::/var/log/stuff/spanish.log]
CHARSET=ISO-8859-1
poke Any news?
Did you ever have any success finding the proper character encoding?
The language may be Spanish, but I feel pretty confident that the character encoding is ISO-8859-1. Splunk's default is UTF-8, so setting UTF-8 explicitly won't help (and we already know it's not UTF-8). If you look at http://en.wikipedia.org/wiki/ISO/IEC_8859-1, you can see that '0xF3' is 'ó' in ISO-8859-1 encoding.
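If you want to check this outside of Splunk, here is a rough Python sketch. It assumes the raw bytes in the log look like the "Gesti\xF3n" sample from the question; the helper name is_valid_utf8 is just something made up for illustration, not part of Splunk or any library.

```python
# Sample bytes as they would appear in the file if it is ISO-8859-1 encoded
# (assumption based on the "Gesti\xF3n" output quoted in the question).
raw = b"Gesti\xf3n"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xF3 followed by a plain ASCII byte is not a valid UTF-8 sequence.
print(is_valid_utf8(raw))

# Decoding the same bytes as ISO-8859-1 maps 0xF3 to 'ó'.
print(raw.decode("iso-8859-1"))

# For comparison, this is how the same word is stored in real UTF-8:
# 'ó' takes two bytes (0xC3 0xB3) instead of the single byte 0xF3.
print("Gestión".encode("utf-8"))
```

Running the same kind of check against a few lines read from the real file in binary mode should tell you whether the data is UTF-8 or a single-byte encoding such as ISO-8859-1.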
I use a sourcetype stanza instead of a source stanza because a new source file is created every day, rather than there being only one file.
Thanks for your answer. The log is not Western European, it is in Spanish. I tried adding
[sourcetype_Name]
CHARSET=UTF-8
to props.conf, but it doesn't work.