<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246549#M47591</link>
    <description>&lt;P&gt;If you look at the timestamp outside of the event, it shows 3/15/16 8:12.00 AM.&lt;/P&gt;

&lt;P&gt;It is supposed to be 12/15/16 8:12.00 AM, as shown in the event itself.&lt;/P&gt;</description>
    <pubDate>Thu, 19 Jan 2017 17:08:03 GMT</pubDate>
    <dc:creator>greenwood1972</dc:creator>
    <dc:date>2017-01-19T17:08:03Z</dc:date>
    <item>
      <title>Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246546#M47588</link>
      <description>&lt;P&gt;I am monitoring a directory on the search head server that contains a group of CSVs that are being imported into Splunk. I have set up an app for this import type with inputs, props and transforms (pasted below), and it works for 9,997 out of 10,012 line items in an example CSV. The CSV does contain a header row that I have extracted, but the header row is not consistent from file to file. The time field "Response Time" is consistent, however, and it is the field the events need to be timestamped on.&lt;/P&gt;

&lt;P&gt;The issue I am running across is that a few events are not picking up the correct MONTH, even though I have defined which field to extract the event time from. I cannot find any pattern or reason for the few oddball events showing a different month than the "Response Time" field defined in the transforms.conf file.&lt;/P&gt;

&lt;P&gt;I might be missing something or have things turned around, but I am at a loss as of now.&lt;/P&gt;

&lt;P&gt;inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///opt/data/web_marketauto]
index = scratch
sourcetype = acton:website:data
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[acton:website:data]
INDEXED_EXTRACTIONS = csv
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
CHECK_FOR_HEADER = True
CATEGORY = Structured
DESCRIPTION = Website data from Act-On
PULLDOWN_TYPE = true
HEADER_FIELD_LINE_NUMBER = 1
TRANSFORMS-webfieldtransform = web_field_trans
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[web_field_trans]
TIME_FIELD = Response Time
TIME_FORMAT = %b %d &amp;amp;Y %H:%M %Z
DELIMS = ,
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Screenshot of the indexed oddball event:&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Screenshot of the indexed oddball event:"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/2350iB2F4337E52897029/image-size/large?v=v2&amp;amp;px=999" role="button" title="Screenshot of the indexed oddball event:" alt="Screenshot of the indexed oddball event:" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Jan 2017 22:30:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246546#M47588</guid>
      <dc:creator>greenwood1972</dc:creator>
      <dc:date>2017-01-18T22:30:25Z</dc:date>
    </item>
    <item>
      <title>Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246547#M47589</link>
      <description>&lt;P&gt;There's nothing incorrect I can see in that image.  &lt;/P&gt;

&lt;P&gt;It extracted Dec 15 2016 8:12 AM MST to Dec 15 2016 8:12 AM MST.&lt;/P&gt;

&lt;P&gt;Is there a MONTH field, not shown, that is incorrect?&lt;/P&gt;

&lt;P&gt;Or is there supposed to be a MONTH field that is missing?&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jan 2017 01:27:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246547#M47589</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-01-19T01:27:38Z</dc:date>
    </item>
    <item>
      <title>Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246548#M47590</link>
      <description>&lt;P&gt;From your data, it looks like your TIME_FORMAT should be this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIME_FORMAT = %b %d %Y %I:%M %p %Z
Dec 15 2016 8:12 AM MST
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Maybe this is what is causing your problem.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jan 2017 04:13:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246548#M47590</guid>
      <dc:creator>nabeel652</dc:creator>
      <dc:date>2017-01-19T04:13:55Z</dc:date>
    </item>
    <item>
      <title>Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246549#M47591</link>
      <description>&lt;P&gt;If you look at the timestamp outside of the event, it shows 3/15/16 8:12.00 AM.&lt;/P&gt;

&lt;P&gt;It is supposed to be 12/15/16 8:12.00 AM, as shown in the event itself.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jan 2017 17:08:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246549#M47591</guid>
      <dc:creator>greenwood1972</dc:creator>
      <dc:date>2017-01-19T17:08:03Z</dc:date>
    </item>
    <item>
      <title>Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246550#M47592</link>
      <description>&lt;P&gt;Thanks for the sanity check. I implemented the time format change (I had missed the 24-hour vs. 12-hour strftime formatting in my original TIME_FORMAT), but the problem still exists.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jan 2017 17:23:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246550#M47592</guid>
      <dc:creator>greenwood1972</dc:creator>
      <dc:date>2017-01-19T17:23:21Z</dc:date>
    </item>
    <item>
      <title>Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246551#M47593</link>
      <description>&lt;P&gt;I added some reference points in this image for the time extraction issue.&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Added some reference points in this image for time extraction"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/2349iA3D809DD5714C306/image-size/large?v=v2&amp;amp;px=999" role="button" title="Added some reference points in this image for time extraction" alt="Added some reference points in this image for time extraction" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jan 2017 17:31:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246551#M47593</guid>
      <dc:creator>greenwood1972</dc:creator>
      <dc:date>2017-01-19T17:31:22Z</dc:date>
    </item>
    <item>
      <title>Re: Why does importing CSV files from a server directory using transforms.conf results in random month being inserted into event time?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246552#M47594</link>
      <description>&lt;P&gt;I reworked my .conf files and in doing so solved my issue (and a few others). Instead of letting Splunk define my fields and timestamp, I am using TIME_PREFIX, REGEX, FIELDS, TRANSFORMS and REPORT to explicitly tell Splunk what to do. Everything works great now, and it gives me more flexibility to define or rename fields via REPORT if the header names ever change or new ones are added in the future.&lt;/P&gt;

&lt;P&gt;inputs.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///opt/data/web_marketauto/...]
host = XX.XXX.XX.XXX
index = scratch
sourcetype = acton:website:data
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;UL&gt;
&lt;LI&gt;Monitoring a directory for all files&lt;/LI&gt;
&lt;LI&gt;Defining host since the host is not included in the source files&lt;/LI&gt;
&lt;LI&gt;Dumping to a test "scratch" index for testing&lt;/LI&gt;
&lt;LI&gt;Defining sourcetype&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[acton:website:data]
description = Website data from Act-On
KV_MODE = none
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
TIME_PREFIX= ^[^\w]?^[^\d]+\,
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %b %d %Y %I:%M %p %Z
TZ = MST
TRANSFORMS-t2 = delete_web_headers
REPORT-acton_website_fields = acton_website_fields
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;UL&gt;
&lt;LI&gt;Sourcetype as defined in inputs.conf&lt;/LI&gt;
&lt;LI&gt;Description of sourcetype&lt;/LI&gt;
&lt;LI&gt;"KV_MODE = none" stops Splunk from auto-generating fields that might override the fields I define in transforms.conf&lt;/LI&gt;
&lt;LI&gt;No line merging&lt;/LI&gt;
&lt;LI&gt;No binary check&lt;/LI&gt;
&lt;LI&gt;Defining where to find the timestamp that Splunk should use at index time&lt;/LI&gt;
&lt;LI&gt;How far past the end of my TIME_PREFIX regex to look for the timestamp&lt;/LI&gt;
&lt;LI&gt;Defining the timestamp format in my source file&lt;/LI&gt;
&lt;LI&gt;What time zone my source file was written in&lt;/LI&gt;
&lt;LI&gt;Defining an index-time TRANSFORMS that eliminates the header row from my source file&lt;/LI&gt;
&lt;LI&gt;Defining a search-time REPORT that defines the fields to use&lt;/LI&gt;
&lt;/UL&gt;
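
&lt;P&gt;A rough Python sketch of how the timestamp settings above fit together. It is only an illustration: the sample row is hypothetical (no real rows are shown in this thread), Python's re and strptime stand in for Splunk's own regex and timestamp handling, and the zone name is dropped because Python's %Z recognizes only a few zone names.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;
```python
import re
from datetime import datetime

# Hypothetical CSV row; the real Act-On export is not shown in the thread.
row = "Jane Doe,jane@example.com,Dec 15 2016 8:12 AM MST,/pricing"

# TIME_PREFIX from props.conf: the timestamp search starts where this match ends.
time_prefix = re.compile(r"^[^\w]?^[^\d]+\,")
# MAX_TIMESTAMP_LOOKAHEAD: how many characters past the prefix are examined.
lookahead = 25

match = time_prefix.search(row)
window = row[match.end():match.end() + lookahead]

# Keep month, day, year, time and AM/PM from the third CSV column.
stamp = " ".join(window.split(",")[0].split()[:5])

# Same directives as TIME_FORMAT, minus %Z (see the note above the code).
parsed = datetime.strptime(stamp, "%b %d %Y %I:%M %p")

print(window)
print(parsed.isoformat())
```
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In this sketch the prefix match ends right after the second comma, so the 25-character window starts at the month name.&lt;/P&gt;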

&lt;P&gt;transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[delete_web_headers]
REGEX = ^(Name\s\,.*)
DEST_KEY = queue
FORMAT = nullQueue

[acton_website_fields]
DELIMS = ","
FIELDS = name,email,response_time,visited_page,ip_address,visitor_company,visitor_locations,referrer,search_engine,search_query,browser,event_name,event_start_date,user_type,first_name,last_name,email_address,registered,attended,duration,title,company,phone,address_1,address_2,city,state,post_code,country,company,job_title,business_phone,currently_use_splunk,company_splunk_use,upcoming_splunk_project,splunk_experience_level,topic,source_campaign,form,campaign,campaign_id,ip_address_2,browser_2,referrer_2,search,geo_company_name,geo_country_code,geo_country,geo_state,geo_city,geo_postal_code,page_url,email_3,description,lead_source,heard_about_gtri,email_4,middle_name,department,business_street,business_city,business_state,business_postal_code,business_country,business_fax,cell_phone,business_website,personal_website,account_name,full_name,mobile_phone,modified_on,ownder,parent_account,website,all_job_functions,linkedin_profile,twitter_profile,facebook_profile,gtri_splunk_services,additional_details,comments,address_id,splunk_experience,splunk_use_case
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;[delete_web_headers]&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;REGEX matches the header row (the first row) in my source file&lt;/LI&gt;
&lt;LI&gt;DEST_KEY = queue tells Splunk to reroute the row matched by the REGEX&lt;/LI&gt;
&lt;LI&gt;FORMAT = nullQueue tells Splunk to send that row to the null queue, discarding it at index time&lt;/LI&gt;
&lt;LI&gt;This keeps the header row of my source file from being indexed as an event&lt;/LI&gt;
&lt;/UL&gt;
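
&lt;P&gt;To illustrate which lines the header regex would catch, here is a small Python sketch. The two sample lines are hypothetical, since the real header row is not shown in this thread. Note that the regex expects a whitespace character between "Name" and the comma, so the made-up header is written that way.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;
```python
import re

# REGEX from the [delete_web_headers] stanza.
header_re = re.compile(r"^(Name\s\,.*)")

# Hypothetical lines; the real header row is not shown in the thread.
lines = [
    "Name ,E-mail,Response Time,Visited Page",
    "Jane Doe,jane@example.com,Dec 15 2016 8:12 AM MST,/pricing",
]

# Lines matching the regex are the ones the transform routes to nullQueue.
kept = [line for line in lines if not header_re.search(line)]

print(kept)
```
&lt;/CODE&gt;&lt;/PRE&gt;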

&lt;P&gt;[acton_website_fields]&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;DELIMS tells Splunk what the delimiter will be when I define my fields&lt;/LI&gt;
&lt;LI&gt;FIELDS are the manually assigned field names, in column order, that I defined for every row of my source file(s)&lt;/LI&gt;
&lt;LI&gt;This gives me a ton of options for normalizing the field names at search time&lt;/LI&gt;
&lt;/UL&gt;
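
&lt;P&gt;The positional pairing of DELIMS and FIELDS can be pictured with a short Python sketch. It assumes a hypothetical event and uses only the first four names from the FIELDS list. Python's csv module is a stand-in here, and Splunk's own handling of quoting may differ.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;
```python
import csv

# First four names from the FIELDS list in [acton_website_fields].
fields = ["name", "email", "response_time", "visited_page"]

# Hypothetical event text; real rows are not shown in the thread.
event = "Jane Doe,jane@example.com,Dec 15 2016 8:12 AM MST,/pricing"

# Split on the delimiter and pair each value with the name at the same position.
values = next(csv.reader([event], delimiter=","))
extracted = dict(zip(fields, values))

print(extracted["response_time"])
```
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Because the pairing is purely positional, the order of the names in FIELDS has to follow the column order of the file.&lt;/P&gt;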

&lt;P&gt;I have gone a bit further when indexing other sources and defined individual fields on a row-by-row basis, based on certain attributes of the data in each row, using multiple REPORTs. This helps when groups of rows contain different fields.&lt;/P&gt;

&lt;P&gt;Hopefully this helps someone out, hate it when solutions are not posted &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 12:50:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-does-importing-CSV-files-from-a-server-directory-using/m-p/246552#M47594</guid>
      <dc:creator>greenwood1972</dc:creator>
      <dc:date>2020-09-29T12:50:05Z</dc:date>
    </item>
  </channel>
</rss>

