Help with CSV inputs: UTF-16LE and header input problems

bowesmana
SplunkTrust

I have a UTF-16 CSV file with a byte order mark (bytes 0xFF 0xFE) and the CSV field names in the first line.

I have set the charset for that input type to utf-16le, which works, but the field names are extracted incorrectly. For example, the field name 'Company' is shown as

x00C_x00o_x00m_x00p_x00a_x00n_x00y_x00
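To illustrate where a name like that could come from, here is a minimal Python sketch. It is not Splunk's code; it only assumes that the header bytes are read one byte at a time instead of as UTF-16LE, and that each NUL byte is then shown as `_x00`. Both the function name and that replacement rule are assumptions made for the demonstration.

```python
import codecs

def bytewise_header_name(name):
    """Show how a UTF-16LE header name looks if each byte is treated as one character."""
    # UTF-16LE file content: BOM (FF FE) followed by two bytes per ASCII character.
    raw = codecs.BOM_UTF16_LE + name.encode("utf-16-le")
    # Skip the 2-byte BOM, then map every byte to a character (latin-1 is 1:1).
    bytewise = raw[2:].decode("latin-1")
    # Assumed display rule: render each NUL byte as the literal text "_x00".
    return raw, bytewise.replace("\x00", "_x00")

raw, mangled = bytewise_header_name("Company")
correct = raw.decode("utf-16")  # the "utf-16" codec reads the BOM and drops it

print(raw.hex(" "))
print(correct)
print(mangled)
```

The interleaved `00` bytes in the hex dump are the high bytes of each UTF-16LE code unit, and the mangled string closely resembles the field name above, which would be consistent with the header line not being decoded with the configured charset.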

I have tried various ways to fix this.

First, I tried skipping the first line using the following in props.conf:

PREAMBLE_REGEX = \ufffe
FIELD_NAMES = id,username,firstname,lastname,company,time,ipaddress

but then I get no named fields at all, and the first line is indexed as a record.

Second, I tried FIELD_HEADER_REGEX, also with no luck.

Edit: I also tried setting PREAMBLE_REGEX to \ufeff, and I also removed the BOM from the file, but the field names are still decoded incorrectly.

Converting the file to UTF-8 works fine, but that's not a practical solution in the live environment.

Has anyone got UTF-16 + CSV working?

sylim_splunk
Splunk Employee

The fix for SPL-78590 was released in 6.0.2 and later. Thank you.

rizzo75
Path Finder

I am having the same problem. Did you ever find a solution?

Thanks,
Joe

rizzo75
Path Finder

I opened a case with Splunk, and bug SPL-78590 has been logged for this issue. It seems the field extraction does not take the specified CHARSET into account.

bowesmana
SplunkTrust

No Splunk solution, but I converted the files to UTF-8. How you do this depends on your platform; I used

iconv -f UTF-16LE -t UTF-8

to convert the file.
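If iconv is not available on your platform, the same conversion can be sketched in Python. This is only a rough equivalent, not something from the original setup: the function name `utf16_to_utf8` and the file paths are hypothetical, and it assumes the file starts with a BOM as described in the question.

```python
def utf16_to_utf8(src, dst):
    """Rewrite a BOM-prefixed UTF-16 text file as UTF-8 without a BOM."""
    # The "utf-16" codec uses the BOM to pick the byte order and drops it.
    # Without a BOM it falls back to the machine's native byte order.
    # newline="" keeps the original line endings (e.g. \r\n) unchanged.
    with open(src, "r", encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)

if __name__ == "__main__":
    import sys
    # Hypothetical usage: python convert.py input.csv output.csv
    if len(sys.argv) == 3:
        utf16_to_utf8(sys.argv[1], sys.argv[2])
```

Reading line by line keeps memory use flat for large files, and the output has no BOM, so the header line starts directly with the first field name.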

bowesmana
SplunkTrust

Yes, values are fine, just the header names are wrong.

gkanapathy
Splunk Employee

Just to be certain: the field values are decoded okay, but the field names are not. Is that correct?
