I have datasets in TSV format with no header row in the file. I tried to use the import wizard: base the sourcetype on TSV, define a custom header, and set the (long) list of header fields. For some reason the custom headers were not accepted. Does anyone have a sample props.conf for a TSV file with a custom header that works? 😞
Maybe the header is too long, I don't know.
Here are the header fields as comma separated list:
accept_language,browser,browser_height,browser_width,c_color,campaign,channel,click_action,click_action_type,click_context,click_context_type,click_sourceid,click_tag,code_ver,color,connection_type,cookies,country,ct_connect_type,curr_factor,curr_rate,currency,cust_hit_time_gmt,cust_visid,daily_visitor,date_time,domain,duplicate_events,duplicate_purchase,duplicated_from,evar1-250,event_list,exclude_hit,first_hit_page_url,first_hit_pagename,first_hit_referrer,first_hit_time_gmt,geo_city,geo_country,geo_dma,geo_region,geo_zip,hier1-5,hier3,hier4,hier5,hit_source,hit_time_gmt,hitid_high,hitid_low,homepage,hourly_visitor,ip,ip2,j_jscript,java_enabled,javascript,language,last_hit_time_gmt,last_purchase_num,last_purchase_time_gmt,mcvisid,mobile* post_mobile*,mobile_id,monthly_visitor,mvvar1-3,namespace,new_visit,os,p_plugins,page_event,page_event_var1,page_event_var2,page_event_var3,page_type,page_url,pagename,paid_search,partner_plugins,persistent_cookie,plugins,post_ page_event,post_ page_type,post_browser_height,post_browser_width,post_campaign,post_channel,post_cookies,post_currency,post_cust_hit_time_gmt,post_cust_visid,post_evar1-75,post_event_list,post_hier1-5,post_java_enabled,post_keywords,post_mvvar1-3,post_page_event_var1,post_page_event_var2,post_page_event_var3,post_page_url,post_pagename,post_pagename_no_url,post_partner_plugins,post_persistent_cookie,post_product_list,post_prop1-75,post_purchaseid,post_referrer,post_search_engine,post_state,post_survey,post_t_time_info,post_tnt,post_tnt_action,post_transactionid,post_visid_high,post_visid_low,post_visid_type,post_zip,prev_page,product_list,product_merchandising,prop1-75,purchaseid,quarterly_visitor,ref_domain,ref_type,referrer,resolution,s_resolution,sampled_hit,search_engine,search_page_num,secondary_hit,service,social*,post_social*,sourceid,state,stats_server,t_time_info,tnt,tnt_action,tnt_post_vista,transactionid,truncated_hit,ua_color,ua_os,ua_pixels,user_agent,user_hash,user_server,userid,username,va_closer_detail,va_closer_id,va_finder_detail,va_finder_id,va_instance_event,va_new_engagement,video*,post_video*,visid_high,visid_low,visid_new,visid_timestamp,visid_type,visit_keywords,visit_num,visit_page_num,visit_referrer,visit_search_engine,visit_start_page_url,visit_start_pagename,visit_start_time_gmt,weekly_visitor,yearly_visitor,zip
There were several things off:
There are differences between the onboarding wizards; in this case the one under "Data inputs » Files & directories" worked better than the one available from the "Data inputs" dialog.
The extra "," in the header delimiter came from trying to get the headers to match (while looking in the wrong place, i.e. suspecting an issue with the long list of headers) and is unnecessary.
The headers documented online do not match the data; they change from time to time and are delivered in a separate .tsv file (rejoice, rejoice).
The preview of the data import fails to reflect how the data looks after import (i.e. correct ... ).
Your configuration files look fine, but I would keep only the FIELD_NAMES and INDEXED_EXTRACTIONS = TSV lines (change tsv to TSV) and remove everything else. Then double-check this list:
The sourcetype is mysourcetype exactly (casing, punctuation, etc.).
The props.conf and transforms.conf configuration files are deployed to the Indexers or Heavy Forwarders (or Universal Forwarders in some cases, such as INDEXED_EXTRACTIONS = TSV).
The inputs.conf configuration file is deployed to the Forwarder.
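As a rough sketch of the advice above, the trimmed-down configuration might look something like this. The stanza name mysourcetype comes from the question; the monitor path is a made-up placeholder, and the field list is cut short here purely for illustration (the real one has to name every column, in file order). Treat it as a starting point to test, not a verified setup.

```
# props.conf (on the instance that reads the file, per the checklist above)
[mysourcetype]
INDEXED_EXTRACTIONS = TSV
# Shortened for illustration only; list every column in file order.
FIELD_NAMES = accept_language,browser,browser_height,browser_width

# inputs.conf (on the Forwarder)
# The path below is a hypothetical placeholder, not taken from the thread.
[monitor:///path/to/data/*.tsv]
sourcetype = mysourcetype
disabled = false
```

The sourcetype value in inputs.conf has to match the props.conf stanza name exactly, including casing.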
It seems to work nicely with these settings... How do I close this issue when no reply is quite correct? 🙂
Answer your own question and then click "Accept" on it.
What is in your configuration files right now?
[mysourcetype]
FIELD_DELIMITER = tab
FIELD_NAMES = accept_language,browser,browser_height,browser_width,c_color,campaign,channel,click_action,click_action_type,click_context,click_context_type,click_sourceid,click_tag,code_ver,color,connection_type,cookies,country,ct_connect_type,curr_factor,curr_rate,currency,cust_hit_time_gmt,cust_visid,daily_visitor,date_time,domain,duplicate_events,duplicate_purchase,duplicated_from,evar1-250,event_list,exclude_hit,first_hit_page_url,first_hit_pagename,first_hit_referrer,first_hit_time_gmt,geo_city,geo_country,geo_dma,geo_region,geo_zip,hier1-5,hier3,hier4,hier5,hit_source,hit_time_gmt,hitid_high,hitid_low,homepage,hourly_visitor,ip,ip2,j_jscript,java_enabled,javascript,language,last_hit_time_gmt,last_purchase_num,last_purchase_time_gmt,mcvisid,mobile* post_mobile*,mobile_id,monthly_visitor,mvvar1-3,namespace,new_visit,os,p_plugins,page_event,page_event_var1,page_event_var2,page_event_var3,page_type,page_url,pagename,paid_search,partner_plugins,persistent_cookie,plugins,post_ page_event,post_ page_type,post_browser_height,post_browser_width,post_campaign,post_channel,post_cookies,post_currency,post_cust_hit_time_gmt,post_cust_visid,post_evar1-75,post_event_list,post_hier1-5,post_java_enabled,post_keywords,post_mvvar1-3,post_page_event_var1,post_page_event_var2,post_page_event_var3,post_page_url,post_pagename,post_pagename_no_url,post_partner_plugins,post_persistent_cookie,post_product_list,post_prop1-75,post_purchaseid,post_referrer,post_search_engine,post_state,post_survey,post_t_time_info,post_tnt,post_tnt_action,post_transactionid,post_visid_high,post_visid_low,post_visid_type,post_zip,prev_page,product_list,product_merchandising,prop1-75,purchaseid,quarterly_visitor,ref_domain,ref_type,referrer,resolution,s_resolution,sampled_hit,search_engine,search_page_num,secondary_hit,service,social*,post_social*,sourceid,state,stats_server,t_time_info,tnt,tnt_action,tnt_post_vista,transactionid,truncated_hit,ua_color,ua_os,ua_pixels,user_agent,user_hash,user_server,userid,username,va_closer_detail,va_closer_id,va_finder_detail,va_finder_id,va_instance_event,va_new_engagement,video*,post_video*,visid_high,visid_low,visid_new,visid_timestamp,visid_type,visit_keywords,visit_num,visit_page_num,visit_referrer,visit_search_engine,visit_start_page_url,visit_start_pagename,visit_start_time_gmt,weekly_visitor,yearly_visitor,zip
HEADER_FIELD_DELIMITER = tab
INDEXED_EXTRACTIONS = tsv
disabled = false
Hm, the preview and guidance in the older import wizard seem to fail me; the indexed data looks fine 😕
Why do you set HEADER_FIELD_DELIMITER if there is no header in your file?