Why is Splunk indexing our data in the wrong character encoding?

ankithreddy777
Contributor

Splunk is indexing events in the wrong format.

On the Splunk forwarder, I am seeing warnings like this:

WARN  UTF8Processor - Using charset UTF-8, as the monitor is believed over the raw text which may be UTF-16LE - data_source="C:\Program Files\SplunkUniversalForwarder\var\log\XXX.log", data_host="xxx", data_sourcetype="config"

A few events are indexed in the format below:

\xFF\xFEC\x00:\x00\\x00P\x00r\x00o

The input file is in the proper format. It is the output of the Splunk btool command, copied to a file and ingested into Splunk.

How can we handle this?

VSIRIS
Path Finder
Hi Splunkers,
I have logs like this:

<Header>
<Product>Microsoft SQL Server Reporting Services Version 2011.0110.6615.02 ((SQL11_SP3_QFE-CU).180109-2116 )</Product>
<Locale>English ()</Locale>
<TimeZone>Central Daylight Time</TimeZone>
<Path>D:\Program Files\Microsoft SQL Server\MSRS11.CTSSRS2012\Reporting Services\Logfiles\ReportServerService__11_05_2020_14_52_11.log</Path>
<SystemName>Avotrix69901</SystemName>
<OSName>Microsoft Windows NT 6.2.9200</OSName>
<OSVersion>6.2.9200</OSVersion>
<ProcessID>3296</ProcessID>
<Virtualization>Hypervisor</Virtualization>
</Header>
<ProcessorArchitecture>AMD64</ProcessorArchitecture>
<ApplicationArchitecture>AMD64</ApplicationArchitecture>
processing!ReportServer_0-51!1ed8!11/05/2020-14:52:11:: v VERBOSE: Mapping data reader successfully initialized.
library!ReportServer_0-51!2bc8!11/05/2020-14:52:11:: v VERBOSE: Transaction commit.
processing!ReportServer_0-51!1ed8!11/05/2020-14:52:11:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: , Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 3.;
runningjobs!ReportServer_0-51!2bc8!11/05/2020-14:52:11:: v VERBOSE: Thread pool settings: Available worker: 399, Max worker: 400, Available IO: 400, Max IO: 400
runningjobs!ReportServer_0-51!2bc8!11/05/2020-14:52:11:: v VERBOSE: Spawning new thread for a work item.
runningjobs!ReportServer_0-51!2bc8!11/05/2020-14:52:11:: v VERBOSE: ThreadJobContext.EndCancelableState
runningjobs!ReportServer_0-51!2bc8!11/05/2020-14:52:11:: v VERBOSE: ThreadJobContext.WaitForCancelException entered
runningjobs!ReportServer_0-51!2bc8!11/05/2020-14:52:11:: v
 
After indexing, I am getting events like this:
\x00c\x00h\x00u\x00n\x00k\x00s\x00!\x00R\x00e\x00p\x00o\x00r\x00t\x00S\x00e\x00r\x00v\x00e\x005\x001\x00!\x002\x001\x00d\x000\x00!\x001\x001\x00/\x000\x005\x00/\x002\x000\x002\x000\x00-\x001\x004\x00:\x005\x002\x00:\x001\x002\x00:\x00:\x00 \x00v\x00 \x00V\x00E\x00R\x00B\x00O\x00S\x00E\x00:\x00 \x00R\x00e\x00t\x00r\x00i\x00e\x00v\x00e\x00d\x00 \x00s\x00e\x00g\x00m\x00e\x00n\x00t\x00 \x004\x003\x00f\x00b\x000\x009\x009\x00d\x00-\x00c\x006\x006\x004\x00-\x00e\x00a\x001\x001\x00-\x008\x001\x002\x00d\x00-\x000\x000\x002\x001\x005\x00a\x009\x00b\x000\x008\x00a\x00c\x00 \x00f\x00o\x00r\x00 \x00c\x00h\x00u\x00n\x00k\x00 \x004\x002\x00f\x00b\x000\x009\x009\x00d\x00-\x00c\x006\x006\x004\x00-\x00e\x00a\x001\x001\x00-\x008\x001\x002\x00d\x00-\x000\x000\x002\x001\x005\x00a\x009\x00b\x000\x008\x00a\x00c\x00 \x00f\x00r\x00o\x00m\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00g\x00m\x00e\x00n\x00t\x00 



I solved this issue using the settings below in props.conf:

[MyOwnSourceType]
CHARSET = UTF-16LE
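
To see where the leading \x00 in such events can come from, the effect can be reproduced outside Splunk. This is only a minimal Python sketch, not Splunk's actual pipeline. It assumes the file is UTF-16LE and that lines get split on the raw 0x0A byte as if the data were single-byte text. The two log lines are shortened from the sample above, and `split_like_single_byte` is a hypothetical helper name.

```python
def split_like_single_byte(raw):
    """Split on the raw 0x0A byte, as if every byte were one character."""
    return raw.split(b"\n")


# Two shortened lines from the report server log, encoded the way the
# file is assumed to be stored on disk (UTF-16LE, no BOM in this sketch).
log_text = (
    "v VERBOSE: Transaction commit.\n"
    "v VERBOSE: Spawning new thread for a work item.\n"
)
raw = log_text.encode("utf-16-le")

# Wrong: byte-wise splitting leaves the second byte of each newline
# (0x00) at the start of the next chunk.
chunks = split_like_single_byte(raw)

# Right: decode with the real charset first, then split into lines.
decoded = raw.decode("utf-16-le").splitlines()

print(chunks[1][:12])  # begins with \x00, like the indexed event
print(decoded[1])      # readable text
```

In UTF-16LE a newline is stored as the two bytes 0A 00, so a byte-wise split at 0A pushes the 00 onto the next event. Once the correct charset is applied before line breaking, the text comes out clean.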

dkeck
Influencer

Hi,

Did you try setting the charset for your sourcetype?

Usually, changing the CHARSET option in props.conf fixes this.
Also be aware that the CHARSET option must be set on the UF or at the input level. See more here: http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings

You may have to set it on both the indexer and the UF. I am not sure about that, so just try it (https://answers.splunk.com/answers/106700/seing-null-x00-bytes-in-indexed-data-from-log-file-in-wind...)

It would be something like:

[<sourcetype>]
CHARSET = UTF-16LE
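
If you are not sure which charset a file actually uses, one way to check is to look at its byte-order mark (BOM). Below is a minimal Python sketch under stated assumptions: `guess_charset` is a hypothetical helper, it only looks at the BOM, and it ignores UTF-32 (whose little-endian BOM also starts with FF FE). The sample bytes are the start of the event quoted in the question.

```python
import codecs


def guess_charset(head):
    """Hypothetical helper: map a leading byte-order mark to a charset name."""
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if head.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16BE"
    return "unknown (no BOM)"


# First bytes of the indexed event from the question: \xFF\xFE is the BOM.
sample = b"\xff\xfeC\x00:\x00\\\x00P\x00r\x00o"
print(guess_charset(sample))  # UTF-16LE
```

For a real file, you could pass in the first few bytes read in binary mode, for example `open(path, "rb").read(4)`, where `path` is your monitored log file. A file without a BOM needs another check, such as looking for the alternating \x00 pattern.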