Help with CSV inputs: UTF-16LE and header field-name problems

bowesmana
SplunkTrust

I have a UTF-16 CSV file with a 0xFF 0xFE (little-endian) byte order mark and the CSV field names in the first line.

I have set the charset for that input type to utf-16le, which works for the data itself. However, the field names are extracted incorrectly; for example, the field name 'Company' is shown as

x00C_x00o_x00m_x00p_x00a_x00n_x00y_x00

I have tried various ways to fix this.

Firstly, I tried skipping the first line using the following in props.conf:

PREAMBLE_REGEX = \ufffe
FIELD_NAMES = id,username,firstname,lastname,company,time,ipaddress

But I then get no named fields at all, and the first line is indexed as a record.

Secondly, I tried FIELD_HEADER_REGEX, also with no luck.

Edit: I also tried PREAMBLE_REGEX as \ufeff, and I also removed the BOM, but it still creates the incorrectly decoded UTF-16 field names.

I converted the file to UTF-8 and it works fine, but that's not a practical solution in the live environment.

Has anyone got UTF-16 + CSV working?

sylim_splunk
Splunk Employee

The fix for SPL-78590 has been released in 6.0.2 and later. Thank you.

rizzo75
Path Finder

I am having the same problem. Did you ever find a solution?

Thanks,
Joe

rizzo75
Path Finder

I opened a case with Splunk. Bug SPL-78590 has been logged for this issue. It seems the field extraction is not taking the specified CHARSET into account.
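To illustrate why ignoring the charset would produce a name like the one in the original post, here is a minimal Python sketch. It is not Splunk's actual code path; it only assumes that the header line gets split on the raw comma byte before any UTF-16 decoding. The sample header and both function names are hypothetical.

```python
# Sketch only: mimics a header parser that ignores the configured charset.
# The sample header below is made up for illustration.
BOM_LE = b"\xff\xfe"
raw_header = BOM_LE + "ID,Company,Time\r\n".encode("utf-16-le")


def naive_header_fields(raw: bytes) -> list[bytes]:
    """Split the undecoded bytes on the single comma byte (0x2C)."""
    return raw.split(b",")


def decoded_header_fields(raw: bytes) -> list[str]:
    """Decode as UTF-16 first (the codec consumes the BOM), then split."""
    text = raw.decode("utf-16")
    return text.splitlines()[0].split(",")


if __name__ == "__main__":
    # Each comma is encoded as 2C 00, so the 00 half stays attached to the
    # next field, and every character keeps its own trailing 00 byte.
    print(naive_header_fields(raw_header)[1])
    print(decoded_header_fields(raw_header))
```

In the naive split, the second field comes out as a NUL byte before and after every letter of "Company", which matches the x00C_x00o_..._x00y_x00 pattern once the NUL bytes are made printable. Decoding first gives the clean names.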

bowesmana
SplunkTrust

I found no Splunk solution, so I converted the files to UTF-8. The tool depends on your platform; I used

iconv -f UTF-16LE -t UTF-8

to convert the file.
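Where iconv is not available, the same conversion can be sketched in Python. This is a minimal example, assuming the source file starts with a BOM so that the "utf-16" codec can detect the byte order and drop the BOM; the function name and the file paths in the usage note are hypothetical.

```python
import shutil


def convert_utf16_to_utf8(src_path: str, dst_path: str) -> None:
    """Re-encode a BOM-prefixed UTF-16 text file as UTF-8 (without a BOM)."""
    # newline="" disables newline translation, so CRLF line endings are kept.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        # Stream in chunks so large CSV files are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
```

For example, convert_utf16_to_utf8("export_utf16.csv", "export_utf8.csv") would write a UTF-8 copy that can then be monitored with a UTF-8 charset.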

bowesmana
SplunkTrust

Yes, values are fine, just the header names are wrong.

gkanapathy
Splunk Employee

Just to be certain: the field values are decoded okay, but the field names are not. Is that correct?
