I have a UTF-16 CSV file with a 0xFFFE byte order mark and the CSV field names in the first line.
I have defined the charset for that input type to be UTF-16LE, which works for the data; however, the field names are extracted incorrectly, e.g. the field name 'Company' is shown as
x00C_x00o_x00m_x00p_x00a_x00n_x00y_x00
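That mangled name is what 'Company' looks like when UTF-16LE bytes are read as a single-byte encoding: every ASCII character is followed by a NUL byte, and the NULs get escaped into those x00 sequences. A quick way to see the byte layout (assuming `iconv` and `od` are available):

```shell
# Encode "Company" as UTF-16LE and dump the bytes: each letter is
# followed by a NUL (\0), which a header parser that ignores CHARSET
# will escape into the x00 sequences shown above.
printf 'Company' | iconv -f UTF-8 -t UTF-16LE | od -An -c
```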
I have tried various ways to fix this.
First, skipping the first line using the following in props.conf:
PREAMBLE_REGEX = \ufffe
FIELD_NAMES = id,username,firstname,lastname,company,time,ipaddress
but then I get no named fields at all, and the first line is indexed as a record.
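For reference, the full stanza I'm describing looks roughly like this (the sourcetype name and the INDEXED_EXTRACTIONS line are my assumptions about a typical CSV setup, not something confirmed above):

```ini
# props.conf -- sourcetype name is a placeholder
[my_utf16_csv]
CHARSET = UTF-16LE
INDEXED_EXTRACTIONS = csv
PREAMBLE_REGEX = \ufffe
FIELD_NAMES = id,username,firstname,lastname,company,time,ipaddress
```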
Second, I tried FIELD_HEADER_REGEX, also with no luck.
Edit: I also tried PREAMBLE_REGEX as \ufeff, and I also removed the BOM entirely, but it still produces the incorrectly decoded UTF-16 field names.
I converted the file to UTF-8 and it works fine, but that's not a practical solution in the live environment.
Has anyone got UTF-16 + CSV working?
The fix for SPL-78590 has been released in 6.0.2+. Thank you.
I am having the same problem. Did you ever find a solution?
Thanks,
Joe
I opened a case with Splunk. Bug SPL-78590 has been logged for this issue. It seems the field extraction does not take the specified CHARSET into consideration.
No Splunk-side solution, but depending on your platform you can convert the files to UTF-8 first. I used
iconv -f UTF-16LE -t UTF-8
to convert the file (note that iconv expects the encoding name spelled UTF-16LE).
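In case it helps, here is the round trip sketched end to end with a generated sample file (the field names and data are just examples). One gotcha: when you name an explicit endianness like UTF-16LE, iconv passes the BOM through as U+FEFF into the UTF-8 output, so I strip the two BOM bytes first:

```shell
# Build a sample UTF-16LE CSV with a leading 0xFF 0xFE BOM (hypothetical data)
printf 'Company,Time\nAcme,12:00\n' | iconv -f UTF-8 -t UTF-16LE > body.bin
{ printf '\377\376'; cat body.bin; } > sample-utf16.csv

# Drop the 2-byte BOM, then convert; otherwise the BOM would survive
# into the UTF-8 file as the bytes EF BB BF at the start of the header
tail -c +3 sample-utf16.csv | iconv -f UTF-16LE -t UTF-8 > sample-utf8.csv

head -1 sample-utf8.csv   # prints a clean "Company,Time" header
```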
Yes, the values are fine; only the header names are wrong.
Just want to be certain: the field values are decoded okay, but the field names are not. Is that correct?