Splunk Search

Issue with á é í ó ú characters in Logs

Path Finder

Hello everybody.
I'm having trouble managing logs that contain Spanish strings with accented characters (ó, á), such as Gestión, Emisión, módulo, etc.

When indexed, these strings become "Gesti\xF3n", "Emisi\xF3n", "M\xF3dulo".

What should I do so that the log information appears accurately, exactly as it was written before indexing?
Many thanks in advance, and have a great New Year's Eve.

Splunk Employee

Is this the problem you are seeing?

SplunkTrust

What is the log file's character encoding? Perhaps ISO-8859-1? If it is NOT UTF-8, then you will need to configure Splunk with the proper CHARSET= in props.conf. Assuming the file is actually using ISO-8859-1 (a common encoding for Western European languages), then something like this should work:

[source::/var/log/stuff/spanish.log]
CHARSET=ISO-8859-1
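Before changing props.conf, it can help to confirm that the file really is not UTF-8. The following is a minimal Python sketch, not part of Splunk; the helper name `non_utf8_lines` and the sample bytes are hypothetical. It reports which lines fail to decode as UTF-8.

```python
def non_utf8_lines(raw):
    """Return the 1-based numbers of lines in `raw` (bytes) that are not valid UTF-8."""
    bad = []
    for number, line in enumerate(raw.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical sample: line 2 carries the single byte 0xF3, as in the indexed events.
sample = b"status ok\nGesti\xf3n de m\xf3dulo\n"
print(non_utf8_lines(sample))
```

To check a real log, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function. Any reported line means the file is not UTF-8 and a CHARSET setting is worth trying.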

SplunkTrust

Poke. Any news?


SplunkTrust

Did you ever have any success finding the proper character encoding?


SplunkTrust

The language may be Spanish, but I feel pretty confident that the character encoding is ISO-8859-1. Splunk's default is UTF-8, so setting UTF-8 explicitly won't help (and we already know it's not UTF-8). If you look at http://en.wikipedia.org/wiki/ISO/IEC_8859-1, you can see that '0xF3' is 'ó' in ISO-8859-1 encoding.
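The point about 0xF3 can be checked outside Splunk. This is a small Python sketch, assuming the raw event bytes are the ones shown in the question ("Gesti\xF3n"). It decodes them as ISO-8859-1, shows that the same bytes are rejected as UTF-8, and prints how 'ó' would look if the file really were UTF-8.

```python
# Raw bytes as they appear in the indexed event from the question.
raw = b"Gesti\xf3n"

# ISO-8859-1 maps every byte to a character, and 0xF3 is 'ó'.
as_latin1 = raw.decode("iso-8859-1")

# The same bytes are not valid UTF-8: 0xF3 would have to start a
# multi-byte sequence, but it is followed by the ASCII letter 'n'.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(as_latin1)
print(utf8_ok)

# In a genuine UTF-8 file, 'ó' is stored as two bytes, not as a lone 0xF3.
print("ó".encode("utf-8"))
```

A lone 0xF3 byte between ASCII letters is therefore a strong hint that the file is in a single-byte encoding such as ISO-8859-1 rather than UTF-8.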

Path Finder

I use a sourcetype stanza instead of a source stanza, because a new source file is created every day rather than there being a single file.


Path Finder

Thanks for your answer. The log is not Western European; it is in Spanish. I tried adding
[sourcetype_Name]
CHARSET=UTF-8
to props.conf, but it doesn't work.