
Splunk exhibiting weird behavior while importing a CSV

Engager

Hello all, I am having problems importing a .txt file that is formatted as a CSV. There are two issues. First, the file has a field-names line at the top, but Splunk does not seem to recognize the fields, even when the field names option is set to line and the line number is specified. Second, regardless of which options I specify in the advanced configuration, the field extractions for lines 5 and 15 (6 and 16 in the file itself) keep adding extra fields because of how those lines are structured: the last portion of the line contains quotes and additional commas. Splunk appears to ignore the quotes and treat the commas inside them as field delimiters instead of as data from the log.

These are the fields it is extracting for some reason:

        _time   EXTRA_FIELD_12  EXTRA_FIELD_13  EXTRA_FIELD_14  timestamp   x00c_x00o_x00m_x00p_x00u_x00t_x00e_x00r_x00 x00d_x00a_x00t_x00a_x00b_x00y_x00t_x00e_x00s_x00    x00d_x00a_x00t_x00a_x00c_x00o_x00d_x00e_x00 x00e_x00n_x00d_x00t_x00i_x00m_x00e_x00  x00e_x00x_x00e_x00c_x00u_x00t_x00i_x00o_x00n_x00i_x00d_x00  x00m_x00e_x00s_x00s_x00a_x00g_x00e_x00  x00o_x00p_x00e_x00r_x00a_x00t_x00o_x00r_x00 x00s_x00o_x00u_x00r_x00c_x00e_x00   x00s_x00o_x00u_x00r_x00c_x00e_x00i_x00d_x00 x00s_x00t_x00a_x00r_x00t_x00t_x00i_x00m_x00e_x00    xFF_xFE__x00F_x00i_x00e_x00l_x00d_x00s_x00__x00 _x00e_x00v_x00e_x00n_x00t_x00
1   Failed to parse timestamp. Defaulting to file modtime.  5/9/2018                none    Servername  0x  0   5/21/2015 0:15  {0E08A328-B94D-41E1-8CE7-############}  Beginning of package execution. DOMAIN\USERNAME_SVC RunSequential   {0330C6EE-C757-4A46-A5F3-############}  5/21/2015 0:15  PackageStart
        50:34.0                                                         
2   Failed to parse timestamp. Defaulting to file modtime.  5/9/2018                none    Servername  0x  0   5/21/2015 0:15  {0E08A328-B94D-41E1-8CE7-############}  (null)  DOMAIN\USERNAME_SVC RunSequential   {0330C6EE-C757-4A46-A5F3-############}  5/21/2015 0:15  OnPreExecute
        50:34.0                                                         
3   Failed to parse timestamp. Defaulting to file modtime.  5/9/2018                none    Servername  0x  0   5/21/2015 0:15  {0E08A328-B94D-41E1-8CE7-############}  (null)  DOMAIN\USERNAME_SVC Insert to applicationTracking - Package Start   {3722af5e-7c40-4c91-9829-############}  5/21/2015 0:15  OnPreExecute
        50:34.0                                                         
4   CSV StreamId: 0 has extra incorrect columns in certain fields.  5/9/2018    DbName2 Step    St...". none    Servername  0x  100 5/21/2015 0:15  {0E08A328-B94D-41E1-8CE7-############}  Executing query "insert dbo.applicationTracking (DbName1    DOMAIN\USERNAME_SVC Insert to applicationTracking - Package Start   {3722af5e-7c40-4c91-9829-############}  5/21/2015 0:15  OnProgress
    Failed to parse timestamp. Defaulting to file modtime.  50:34.0                                                         
5   Failed to parse timestamp. Defaulting to file modtime.  5/9/2018                none    Servername  0x  0   5/21/2015 0:15  {0E08A328-B94D-41E1-8CE7-############}  (null)  DOMAIN\USERNAME_SVC Insert to applicationTracking - Package Start   {3722af5e-7c40-4c91-9829-############}  5/21/2015 0:15  OnPostExecute
        50:34.0                                                         

Line 5: OnProgress,Servername,DOMAIN\USERNAME_SVC,Insert to applicationTracking - Package Start,{3722af5e-7c40-4c91-9829-############},{0E08A328-B94D-41E1-8CE7-############},5/21/2015 12:15:04 AM,5/21/2015 12:15:04 AM,100,0x,Executing query "insert dbo.applicationTracking (DbName1, DbName2, Step, St...".
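A minimal sketch of one possible explanation, using Python's `csv` module rather than Splunk itself (so this only illustrates typical CSV quoting rules, not Splunk's actual parser). In the sample line the quote appears in the middle of the last field, not at its start, and many CSV parsers only honor quotes that open a field. The expected column count of 11 is an assumption taken from the sample line, and it is consistent with the extra columns being named EXTRA_FIELD_12 to EXTRA_FIELD_14. The `split` workaround assumes only the last field can contain commas.

```python
import csv

# Sample line 5 from the post (already obfuscated by the author).
line = (
    r'OnProgress,Servername,DOMAIN\USERNAME_SVC,'
    r'Insert to applicationTracking - Package Start,'
    r'{3722af5e-7c40-4c91-9829-############},'
    r'{0E08A328-B94D-41E1-8CE7-############},'
    r'5/21/2015 12:15:04 AM,5/21/2015 12:15:04 AM,100,0x,'
    r'Executing query "insert dbo.applicationTracking (DbName1, DbName2, Step, St...".'
)

# Assumption: the file has 11 columns, with the free-text message last.
EXPECTED_COLUMNS = 11

# The quote is not the first character of the field, so csv.reader treats it
# as ordinary data and still splits on the commas that follow it.
fields = next(csv.reader([line]))
extra = fields[EXPECTED_COLUMNS:]
print(len(fields), "fields; extras:", extra)

# Possible preprocessing workaround: split on the first 10 commas only,
# so everything after them stays together as the message.
repaired = line.split(",", EXPECTED_COLUMNS - 1)
print(len(repaired), "fields; message:", repaired[-1])
```

With the sample line, the first parse yields 14 fields and the limited split yields 11, with the whole query text kept in the last one.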

The version of Splunk we are using is Splunk 6.5.3 (build 36937ad027d4).

Here is the file I am attempting to ingest (a zip file, obfuscated for obvious reasons): https://ufile.io/3p4qy

MD5 BB8F2FE7D5BDA40D8A9FA221F16EBB55
SHA256 23EE6AF68817D8F62BA294C89887AE33930E55FD462C4C118EB1DF7CAD798073

I think I got everything; let me know if you need any additional details.
Thank you in advance!


Engager

Okay, I figured out the answer to the first question. The file's character set was not the default; it was UCS-2 LE BOM. Changing the charset resolved the issue of Splunk not recognizing the field names at the top of the file.
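For anyone seeing the same `x00`-riddled field names, here is a small sketch in Python (not Splunk) of why the charset matters. UCS-2 LE with a BOM starts with the bytes FF FE and stores each ASCII character followed by a zero byte, which matches the `xFF_xFE_` prefix and the `x00` between letters in the extracted names. The header string below is made up for illustration.

```python
import codecs

# Hypothetical header line, for illustration only.
header = "event,computer,message"

# Build the bytes as a UCS-2 LE file with a BOM would store them.
raw = codecs.BOM_UTF16_LE + header.encode("utf-16-le")
print(raw[:8])  # BOM (FF FE) followed by letter/zero-byte pairs

# Reading with a single-byte charset keeps the BOM bytes and the NULs,
# which is roughly what the garbled field names look like.
wrong = raw.decode("latin-1")
print(repr(wrong[:8]))

# Reading as UTF-16 consumes the BOM and restores the original text.
right = raw.decode("utf-16")
print(right)
```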

The issue with the extra fields on those lines still persists.

Again, any help is appreciated.

Thank you
