Splunk is indexing events in the wrong format.
On the Splunk forwarder, I am seeing this warning:
WARN UTF8Processor - Using charset UTF-8, as the monitor is believed over the raw text which may be UTF-16LE - data_source="C:\Program Files\SplunkUniversalForwarder\var\log\XXX.log", data_host="xxx", data_sourcetype="config"
A few events are indexed like this:
\xFF\xFEC\x00:\x00\\x00P\x00r\x00o
The input file itself looks fine. It is the output of a Splunk btool command, copied to a file and ingested into Splunk.
How can we handle this?
Hi,
Did you try setting the charset for your sourcetype?
Usually, setting the CHARSET option in props.conf fixes this.
Also be aware that the CHARSET option must be set on the UF or at input level - see more here http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings
You may have to set it on both the indexer and the UF. I'm not sure about that, so just try it (https://answers.splunk.com/answers/106700/seing-null-x00-bytes-in-indexed-data-from-log-file-in-wind...)
It would be something like:
[<sourcetype>]
CHARSET = UTF-16LE
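If you want to confirm the encoding before changing props.conf, here is a minimal Python sketch that checks the byte order mark. The leading \xFF\xFE in your indexed event is the UTF-16LE BOM, and the \x00 after every character is the high byte of each 16-bit code unit. The function name detect_bom and the sample bytes are only illustrative; the sample is built from the event fragment in the question, with a trailing \x00 added to complete the last character.

```python
import codecs
from typing import Optional


def detect_bom(raw: bytes) -> Optional[str]:
    """Return a charset name if raw starts with a known byte order mark."""
    # Check the 4-byte BOMs first: the UTF-32LE BOM begins with the
    # same two bytes as the UTF-16LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "UTF-32LE"),
        (codecs.BOM_UTF32_BE, "UTF-32BE"),
        (codecs.BOM_UTF8, "UTF-8"),
        (codecs.BOM_UTF16_LE, "UTF-16LE"),
        (codecs.BOM_UTF16_BE, "UTF-16BE"),
    ]
    for bom, name in boms:
        if raw.startswith(bom):
            return name
    return None


# Bytes taken from the badly indexed event in the question
# (hypothetical sample, padded with one \x00 to complete the last character).
sample = b"\xff\xfeC\x00:\x00\\\x00P\x00r\x00o\x00"

print(detect_bom(sample))               # charset implied by the BOM
print(sample[2:].decode("utf-16-le"))   # text after skipping the 2-byte BOM
```

To check the real file, read its first few bytes with open(path, "rb").read(4) and pass them to detect_bom. A file without a BOM returns None, so this only confirms the encoding when a BOM is present.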