<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Setting date on event based on filename in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50240#M9549</link>
    <description>&lt;H2&gt;What is going on here?&lt;/H2&gt;

&lt;P&gt;This occurs because the timestamp on the events in your source file is incomplete (it has a time but no date), which leaves Splunk to guess which date to assign to each new event.&lt;/P&gt;

&lt;P&gt;In your case, I am fairly certain that the following occurs:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Splunk discovers the file "VMail-110212-000000.Log" and reads the first event, which contains the partial timestamp "23:59:57.334".&lt;/LI&gt;
&lt;LI&gt;From this partial timestamp, Splunk deduces that the date is February 12th 2011 (using the file name and your custom datetime.xml regex) and that the time is "23:59:57.334".&lt;/LI&gt;
&lt;LI&gt;If the next event comes in with a time of "00:01:12.232", for example, Splunk will deduce that we have moved to the next day and will timestamp that event as "February 13th 2011 00:01:12.232".&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;As a result, Splunk will index the file as if it spanned several days. Note that this will also occur whenever the time goes backwards between consecutive events. For example, an event recorded with a time of "10:28:59" following an event with a time of "10:29:12" will also increase the value of date_mday (the internal field where Splunk stores the day of the month) by 1.&lt;/P&gt;

&lt;H2&gt;How to prevent this?&lt;/H2&gt;

&lt;P&gt;As of 4.1.x, there is no way to prevent this from happening when you are indexing historical data (i.e., files that contain events not from the current day).&lt;/P&gt;

&lt;P&gt;If you are indexing current events, you can add the parameter "MAX_DAYS_HENCE = 0" to the props.conf stanza for this source/sourcetype to prevent Splunk from increasing the value of date_mday beyond the current day.&lt;/P&gt;

&lt;P&gt;In 4.2, Splunk will honor the MAX_DIFF_SECS_AGO parameter even for incomplete timestamps (which is not the case in 4.1.x), and you will be able to use that parameter to prevent date_mday increases for both current and historical files. From props.conf.spec:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;MAX_DIFF_SECS_AGO = &amp;lt;integer&amp;gt;
* If the event's timestamp is more than &amp;lt;integer&amp;gt; seconds BEFORE the previous timestamp, only accept it if it has the same exact time format as the majority of timestamps from the source.
* IMPORTANT: If your timestamps are wildly out of order, consider increasing this value.
* Defaults to 3600 (one hour).
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 25 Feb 2011 05:39:42 GMT</pubDate>
    <dc:creator>hexx</dc:creator>
    <dc:date>2011-02-25T05:39:42Z</dc:date>
    <item>
      <title>Setting date on event based on filename</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50239#M9548</link>
      <description>&lt;P&gt;I'm trying to load one of my logs from my phone server into Splunk. Splunk reads the log file and breaks the events correctly.&lt;/P&gt;

&lt;P&gt;My problem is with the date that Splunk places on the event. The events in my log have a time but no date; however, the date is in the filename. When I look at the loaded events in Splunk, the time is extracted properly, but the date is the next day (which is because of the modified date on the file).&lt;/P&gt;

&lt;P&gt;I've copied the existing datetime.xml into /etc/system/local and added an extraction for my date from my filename. How can I pull the date from the filename into Splunk as the date on my event?&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;File name:&lt;/STRONG&gt; d:...\VMail-110212-000000.Log&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Raw events from d:...\VMail-110212-000000.Log (Splunk shows the date as 2/13/11 because of the file's modified date):&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;23:59:57.334 ( 4172: 5540) [MS] Entering Process MWI loop
23:59:57.334 ( 4172: 5540) [MS] Process MWI loop return, no more entries left&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Datetime.xml:&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;datetime&amp;gt;
    &amp;lt;define name="_masheddate3" extract="year, month, day"&amp;gt;
    &amp;lt;text&amp;gt;&amp;lt;![CDATA[source::.*?\\Vmail-\d{2}\d{2}\d{2}.*\.Log]]&amp;gt;&amp;lt;/text&amp;gt;
    &amp;lt;/define&amp;gt;
    ...

    &amp;lt;datepatterns&amp;gt;
    ...
        &amp;lt;use name="_masheddate3"/&amp;gt;
    &amp;lt;/datepatterns&amp;gt;

&amp;lt;/datetime&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Props.conf&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[ShoreTel_VMail]
SHOULD_LINEMERGE = FALSE
DATETIME_CONFIG = \etc\system\local\datetime.xml
TRANSFORMS-shoretel-comment = shoretel_comment
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Inputs.conf&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor://D:\Sample Logs\Shoretel_Current]
disabled=false
host=Shoretel
sourcetype=ShoreTel_VMail
crcSalt=&amp;lt;SOURCE&amp;gt;
index=shoretel
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 25 Feb 2011 04:17:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50239#M9548</guid>
      <dc:creator>snowmizer</dc:creator>
      <dc:date>2011-02-25T04:17:14Z</dc:date>
    </item>
    <item>
      <title>Re: Setting date on event based on filename</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50240#M9549</link>
      <description>&lt;H2&gt;What is going on here?&lt;/H2&gt;

&lt;P&gt;This occurs because the timestamp on the events in your source file is incomplete (it has a time but no date), which leaves Splunk to guess which date to assign to each new event.&lt;/P&gt;

&lt;P&gt;In your case, I am fairly certain that the following occurs:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Splunk discovers the file "VMail-110212-000000.Log" and reads the first event, which contains the partial timestamp "23:59:57.334".&lt;/LI&gt;
&lt;LI&gt;From this partial timestamp, Splunk deduces that the date is February 12th 2011 (using the file name and your custom datetime.xml regex) and that the time is "23:59:57.334".&lt;/LI&gt;
&lt;LI&gt;If the next event comes in with a time of "00:01:12.232", for example, Splunk will deduce that we have moved to the next day and will timestamp that event as "February 13th 2011 00:01:12.232".&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;As a result, Splunk will index the file as if it spanned several days. Note that this will also occur whenever the time goes backwards between consecutive events. For example, an event recorded with a time of "10:28:59" following an event with a time of "10:29:12" will also increase the value of date_mday (the internal field where Splunk stores the day of the month) by 1.&lt;/P&gt;

&lt;H2&gt;How to prevent this?&lt;/H2&gt;

&lt;P&gt;As of 4.1.x, there is no way to prevent this from happening when you are indexing historical data (i.e., files that contain events not from the current day).&lt;/P&gt;

&lt;P&gt;If you are indexing current events, you can add the parameter "MAX_DAYS_HENCE = 0" to the props.conf stanza for this source/sourcetype to prevent Splunk from increasing the value of date_mday beyond the current day.&lt;/P&gt;
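&lt;P&gt;As a rough sketch only, assuming the [ShoreTel_VMail] sourcetype stanza shown in the question, the addition to props.conf might look like the following (the other settings already in that stanza would stay as they are):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sketch; the stanza name is taken from the question)
[ShoreTel_VMail]
# Do not accept timestamps dated later than the current day
MAX_DAYS_HENCE = 0
&lt;/CODE&gt;&lt;/PRE&gt;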

&lt;P&gt;In 4.2, Splunk will honor the MAX_DIFF_SECS_AGO parameter even for incomplete timestamps (which is not the case in 4.1.x), and you will be able to use that parameter to prevent date_mday increases for both current and historical files. From props.conf.spec:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;MAX_DIFF_SECS_AGO = &amp;lt;integer&amp;gt;
* If the event's timestamp is more than &amp;lt;integer&amp;gt; seconds BEFORE the previous timestamp, only accept it if it has the same exact time format as the majority of timestamps from the source.
* IMPORTANT: If your timestamps are wildly out of order, consider increasing this value.
* Defaults to 3600 (one hour).
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 25 Feb 2011 05:39:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50240#M9549</guid>
      <dc:creator>hexx</dc:creator>
      <dc:date>2011-02-25T05:39:42Z</dc:date>
    </item>
    <item>
      <title>Re: Setting date on event based on filename</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50241#M9550</link>
      <description>&lt;P&gt;Thanks for the explanation and help, hexx.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Feb 2011 23:02:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Setting-date-on-event-based-on-filename/m-p/50241#M9550</guid>
      <dc:creator>snowmizer</dc:creator>
      <dc:date>2011-02-25T23:02:46Z</dc:date>
    </item>
  </channel>
</rss>

