I am monitoring the error.log of an Apache server. A single error log file contains events from 2010 to 2011. Splunk parses the timestamps of the 2010 events correctly but fails for the events from 2011: all of the 2011 events were timestamped as 2010/12/31 21:50:47. I searched the _internal index for anything related to error.log and found the following warnings:
01-25-2011 10:32:29.596 WARN DateParserVerbose - Failed to parse timestamp for event. Context="source::D:\XXXXXXXX\Apache\logs\error.log|host::XXXXXXXX01|apache_error|remoteport::2123" Text="[Tue Jan 04 08:56:53 2011] [warn] [client XX.XX.XX.XX]  auth_ldap authenticate: user XXXXXX ..."
01-25-2011 10:32:29.596 WARN DateParserVerbose - The TIME_FORMAT specified is matching timestamps (Tue Jan 4 08:56:53 2011) outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE.
After adding MAX_DAYS_HENCE = 30 to props.conf Splunk was able to parse the timestamp correctly.
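For reference, the fix was a single extra attribute added to the existing sourcetype stanza in props.conf on the indexer (a minimal sketch; the full stanza is listed further down):

```ini
[apache_error]
MAX_DAYS_HENCE = 30
```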
The Apache error log is being forwarded from a remote Splunk light forwarder. The configuration is listed below:
inputs.conf on the light forwarder
[monitor://D:\XXXXXXX\Apache\logs\error.log]
disabled = 0
followTail = 0
sourcetype = apache_error
props.conf on the indexer
[apache_error]
CHARSET = BIG5
TIME_PREFIX = \[
TIME_FORMAT = %a %b %d %T %Y
MAX_TIMESTAMP_LOOKAHEAD = 26
MAX_DAYS_HENCE = 30
The system times of the light forwarder and the indexer are within one minute of each other and are current. My question: is MAX_DAYS_HENCE required to parse the timestamp correctly?
(some information provided here has been modified and replaced with XX) Thanks.
This is perhaps a bit old, but the documentation indicates a few things that may be relevant (and a few that may not):
1) Make sure that your MAX_TIMESTAMP_LOOKAHEAD is correct.
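As a quick sanity check of the poster's values: when TIME_PREFIX is set, the lookahead window starts right after the text the prefix regex matches, and the timestamp string here is 24 characters long, so a lookahead of 26 covers it.

```ini
# TIME_PREFIX = \[ positions the parser just after the opening bracket.
# "Tue Jan 04 08:56:53 2011" is 24 characters, so 26 is sufficient.
TIME_PREFIX = \[
TIME_FORMAT = %a %b %d %T %Y
MAX_TIMESTAMP_LOOKAHEAD = 26
```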
2) From the docs on timestamp assignment, if the preceding steps to determine the event time fail: "5. For file sources, if no date can be identified in the file name, Splunk uses the file's modification time." See http://docs.splunk.com/Documentation/Splunk/latest/Data/HowSplunkextractstimestamps for more info. Your stated event timestamps seem to indicate that this is what is happening.
3) CHARSET configuration should be in props.conf on the forwarder, NOT on the indexer, as per http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings . Failure to use the correct CHARSET may cause your regexes to fail.
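A sketch of what that move would look like, assuming the same sourcetype name on the forwarder side:

```ini
# props.conf on the light forwarder (not the indexer)
[apache_error]
CHARSET = BIG5
```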
4) MAX_DAYS_HENCE should not play any part here.
5) On a side note (and this may not be relevant in your case), you might set the alwaysOpenFile parameter to 1 in inputs.conf on the forwarder. See http://docs.splunk.com/Documentation/Splunk/latest/admin/inputsconf
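For example, the monitor stanza from the question might become (a sketch only; test this before relying on it):

```ini
[monitor://D:\XXXXXXX\Apache\logs\error.log]
disabled = 0
followTail = 0
sourcetype = apache_error
alwaysOpenFile = 1
```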
Hope this helps,
I checked the error.log, and it does not receive a regular stream of events. From what I can see, the file gets written irregularly, at least once a day at different hours. I also noticed one strange thing: the last-modified time of the file got stuck at Dec 31, 2010, 9:10am, even though the file has been written to continuously up to now.
Very odd. I don't see anything about that configuration that would confuse the timestamping code.
In the original source file, is there a fairly regular stream of events, or are there perhaps large (multi-day) gaps? Trying to think of anything that might be relevant.