Hello all, I am having problems importing a .txt file that is formatted as CSV. The file has a header line with field names at the top, but Splunk does not seem to recognize the fields even with the field-names option set to "line" and the line number specified. The second issue is field extraction: regardless of which options I set in the advanced configuration at lines 5 and 15 (lines 6 and 16 in the file itself), Splunk keeps adding extra fields because of how each line is structured. The last portion of the line is wrapped in quotes and contains additional commas, and Splunk appears to ignore the quotes and split on the commas inside them instead of treating them as part of the data.
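For reference, this is a sketch of the kind of props.conf settings I have been trying; the sourcetype name is just a placeholder for our actual one:

```ini
# props.conf -- sketch of the structured-data settings involved;
# [my_csv_txt] is a placeholder sourcetype name.
[my_csv_txt]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ,
FIELD_QUOTE = "
# Field names live on the first line of the file.
HEADER_FIELD_LINE_NUMBER = 1
```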
These are the fields it is extracting for some reason:
_time EXTRA_FIELD_12 EXTRA_FIELD_13 EXTRA_FIELD_14 timestamp x00c_x00o_x00m_x00p_x00u_x00t_x00e_x00r_x00 x00d_x00a_x00t_x00a_x00b_x00y_x00t_x00e_x00s_x00 x00d_x00a_x00t_x00a_x00c_x00o_x00d_x00e_x00 x00e_x00n_x00d_x00t_x00i_x00m_x00e_x00 x00e_x00x_x00e_x00c_x00u_x00t_x00i_x00o_x00n_x00i_x00d_x00 x00m_x00e_x00s_x00s_x00a_x00g_x00e_x00 x00o_x00p_x00e_x00r_x00a_x00t_x00o_x00r_x00 x00s_x00o_x00u_x00r_x00c_x00e_x00 x00s_x00o_x00u_x00r_x00c_x00e_x00i_x00d_x00 x00s_x00t_x00a_x00r_x00t_x00t_x00i_x00m_x00e_x00 xFF_xFE__x00F_x00i_x00e_x00l_x00d_x00s_x00__x00 _x00e_x00v_x00e_x00n_x00t_x00
1 Failed to parse timestamp. Defaulting to file modtime. 5/9/2018 none Servername 0x 0 5/21/2015 0:15 {0E08A328-B94D-41E1-8CE7-############} Beginning of package execution. DOMAIN\USERNAME_SVC RunSequential {0330C6EE-C757-4A46-A5F3-############} 5/21/2015 0:15 PackageStart
50:34.0
2 Failed to parse timestamp. Defaulting to file modtime. 5/9/2018 none Servername 0x 0 5/21/2015 0:15 {0E08A328-B94D-41E1-8CE7-############} (null) DOMAIN\USERNAME_SVC RunSequential {0330C6EE-C757-4A46-A5F3-############} 5/21/2015 0:15 OnPreExecute
50:34.0
3 Failed to parse timestamp. Defaulting to file modtime. 5/9/2018 none Servername 0x 0 5/21/2015 0:15 {0E08A328-B94D-41E1-8CE7-############} (null) DOMAIN\USERNAME_SVC Insert to applicationTracking - Package Start {3722af5e-7c40-4c91-9829-############} 5/21/2015 0:15 OnPreExecute
50:34.0
4 CSV StreamId: 0 has extra incorrect columns in certain fields. 5/9/2018 DbName2 Step St...". none Servername 0x 100 5/21/2015 0:15 {0E08A328-B94D-41E1-8CE7-############} Executing query "insert dbo.applicationTracking (DbName1 DOMAIN\USERNAME_SVC Insert to applicationTracking - Package Start {3722af5e-7c40-4c91-9829-############} 5/21/2015 0:15 OnProgress
Failed to parse timestamp. Defaulting to file modtime. 50:34.0
5 Failed to parse timestamp. Defaulting to file modtime. 5/9/2018 none Servername 0x 0 5/21/2015 0:15 {0E08A328-B94D-41E1-8CE7-############} (null) DOMAIN\USERNAME_SVC Insert to applicationTracking - Package Start {3722af5e-7c40-4c91-9829-############} 5/21/2015 0:15 OnPostExecute
50:34.0
Line 5: OnProgress,Servername,DOMAIN\USERNAME_SVC,Insert to applicationTracking - Package Start,{3722af5e-7c40-4c91-9829-############},{0E08A328-B94D-41E1-8CE7-############},5/21/2015 12:15:04 AM,5/21/2015 12:15:04 AM,100,0x,Executing query "insert dbo.applicationTracking (DbName1, DbName2, Step, St...".
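As a sanity check outside Splunk, here is a quick Python sketch using a shortened version of the line above. A strict RFC 4180 parser behaves the same way: quotes that begin mid-field are treated as literal characters, so the embedded commas still split the field, while a field that is quoted from its very first character (with inner quotes doubled) survives as one value:

```python
import csv
import io

# Shortened version of the sample line above: note the quotes begin
# in the middle of the last field, not at its start.
raw = ('OnProgress,Servername,0x,'
       'Executing query "insert dbo.applicationTracking '
       '(DbName1, DbName2, Step, St...".')
row = next(csv.reader(io.StringIO(raw)))
print(len(row))  # -> 7: the embedded commas split the last field apart

# The same data with the last field quoted from its first character
# (inner quotes doubled per RFC 4180) parses as a single field.
quoted = ('OnProgress,Servername,0x,'
          '"Executing query ""insert dbo.applicationTracking '
          '(DbName1, DbName2, Step, St..."""')
fixed = next(csv.reader(io.StringIO(quoted)))
print(len(fixed))  # -> 4: the quoted field stays intact
```

So it looks like the file's producer only quotes part of the field, which may be why the extractions break regardless of the FIELD_QUOTE setting.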
We are running Splunk 6.5.3 (build 36937ad027d4).
Here is the file I am attempting to ingest; it has been obfuscated for obvious reasons. It is a zip file: https://ufile.io/3p4qy
MD5 BB8F2FE7D5BDA40D8A9FA221F16EBB55
SHA256 23EE6AF68817D8F62BA294C89887AE33930E55FD462C4C118EB1DF7CAD798073
I think I got everything; let me know if you need any additional details.
Thank you in advance!
Okay, I figured out the answer to the first question. The character set was not the default; it was UCS-2 LE with a BOM. Changing the charset fixed the first issue, where Splunk wasn't recognizing the field names at the top of the file.
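For anyone hitting the same thing, the fix was along these lines in props.conf (the sourcetype name is a placeholder; Splunk's name for this encoding is UTF-16LE):

```ini
# props.conf -- tell Splunk the file is UCS-2 LE (UTF-16LE) rather
# than the default encoding; [my_csv_txt] is a placeholder sourcetype.
[my_csv_txt]
CHARSET = UTF-16LE
```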
The issue with the extra fields still persists.
Again, any help is appreciated.
Thank you