(Sorry for my English, I'm French)
I have new systems that send syslog to a Splunk Universal Forwarder on port 40514. The UF listens on that port and I receive the syslog messages, so that part works.
When I test with logger and check the configuration of all the systems, they all have LANG=en_CA.UTF8.
My CHARSET parameter is set to AUTO, and I don't understand why the accented characters are replaced by escape sequences (see the example below).
root@SERVERX:/etc/rsyslog.d# logger -plocal0.info "Test logger 61"
root@SERVERX:/etc/rsyslog.d# logger -plocal0.info "Test logger 62 with accent è é à"
Apr 17 16:15:08 10.62.1.140 Apr 17 16:15:08 SERVERX root: Test logger 62 with accent \xE8 \xE9 \xE0
date_hour = 16 date_mday = 17 date_minute = 15 date_month = april date_second = 8 date_wday = friday date_year = 2015 date_zone = local host = 192.168.80.210 index = IndexLav linecount = 1 punct = ::_...:::______\ source = udp:40514 sourcetype = udp:40514 splunk_server = SERVERY splunk_server_group = dmc_group_indexer timeendpos = 16 timestartpos = 0 unix_category = all_hosts unix_group = default
Does anyone know how to fix this?
Thank you in advance.
I'm not sure, but I suspect that Splunk is guessing UTF-8 while the log is not in that encoding. For example, é is Unicode code point U+00E9, which in UTF-8 is 2 bytes: 0xC3 0xA9. But here it looks like you have a substitution for the single byte 0xE9, which makes me believe the data is actually ISO-8859-1 or another character set with a similar mapping.
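To illustrate the difference, here is a small Python sketch (not Splunk code, just a way to check the byte values). It encodes é both ways and then tries to read the bytes from the event (E8, E9, E0, separated by spaces) as UTF-8. The variable names are only for illustration.

```python
# Encode the same character in both candidate character sets.
ch = "é"  # Unicode code point U+00E9
utf8_bytes = ch.encode("utf-8")         # two bytes: C3 A9
latin1_bytes = ch.encode("iso-8859-1")  # one byte: E9

print(utf8_bytes.hex())    # c3a9
print(latin1_bytes.hex())  # e9

# The bytes shown in the event (\xE8 \xE9 \xE0) are not valid UTF-8,
# so a strict UTF-8 decoder rejects them.
event_bytes = b"\xe8 \xe9 \xe0"
try:
    event_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(valid_utf8)  # False
```

A lone 0xE9 byte is what ISO-8859-1 produces for é, which matches the \xE9 seen in the indexed event.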
I made a pcap capture and viewed it in Wireshark:
Syslog message: LOCAL0.INFO: Apr 15 19:45:53 SERVERx root: Test logger 16 accent \350 \351 \340
1000 0... = Facility: LOCAL0 - reserved for local use (16)
.... .110 = Level: INFO - informational (6)
Message: Apr 15 19:45:53 SERVERx root: Test logger 16 accent \350 \351 \340
(backslash before 350, 351 and 340)
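As a quick check, the octal escapes that Wireshark displays can be converted to byte values and decoded. This is a minimal Python sketch, assuming the three escapes are the raw bytes that were sent on the wire; the variable names are only for illustration.

```python
# Octal escapes as displayed by Wireshark: \350 \351 \340
octal_escapes = ["350", "351", "340"]

# Convert each octal value to a single byte.
raw = bytes(int(o, 8) for o in octal_escapes)
print(raw.hex())  # e8e9e0

# Decoded as ISO-8859-1, the bytes give back the accented characters.
decoded = raw.decode("iso-8859-1")
print(decoded)  # èéà
```

These are the same bytes as the \xE8 \xE9 \xE0 in the indexed event, one byte per accented character, which suggests the message left the sender in a single-byte character set rather than in UTF-8.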