Getting Data In

failed to parse timestamp for apache error log

alextsui
Path Finder

I am monitoring the error.log of an Apache server. A single error log file contains events from 2010 to 2011. Splunk parses the timestamps of the 2010 events correctly but fails for the events from 2011: all of them get timestamped 2010/12/31 21:50:47. I searched the _internal index for anything related to error.log and found the following messages:

01-25-2011 10:32:29.596 WARN  DateParserVerbose - Failed to parse timestamp for event.  Context="source::D:\XXXXXXXX\Apache\logs\error.log|host::XXXXXXXX01|apache_error|remoteport::2123" Text="[Tue Jan 04 08:56:53 2011] [warn] [client XX.XX.XX.XX] [4428] auth_ldap authenticate: user XXXXXX ..."

01-25-2011 10:32:29.596 WARN  DateParserVerbose - The TIME_FORMAT specified is matching timestamps (Tue Jan  4 08:56:53 2011) outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE.

After adding MAX_DAYS_HENCE = 30 to props.conf, Splunk was able to parse the timestamps correctly.
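For reference, the fix was just this one line added to the existing apache_error stanza in props.conf on the indexer (the full configuration is listed further down):

[apache_error]
MAX_DAYS_HENCE = 30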

The Apache error log is being forwarded from a remote Splunk light forwarder. The configuration is listed below:

inputs.conf on the light forwarder

[monitor://D:\XXXXXXX\Apache\logs\error.log]
disabled = 0
followTail = 0
sourcetype = apache_error

props.conf on the indexer

[apache_error]
CHARSET = BIG5
TIME_PREFIX = \[
TIME_FORMAT = %a %b %d %T %Y
MAX_TIMESTAMP_LOOKAHEAD = 26
MAX_DAYS_HENCE = 30

The system times of the light forwarder and the indexer are current and within 1 minute of each other. My question: is MAX_DAYS_HENCE required to parse the timestamp correctly?

(Some of the information provided here has been modified and replaced with XX.) Thanks.


kristian_kolb
Ultra Champion

This is perhaps a bit old, but the documentation indicates a few things that may be relevant (and a few that may not):

1) make sure that your TIME_FORMAT, TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD are correct.

2) From the docs on timestamp assignment, if the preceding steps to determine the event time fail: "5. For file sources, if no date can be identified in the file name, Splunk uses the file's modification time." See http://docs.splunk.com/Documentation/Splunk/latest/Data/HowSplunkextractstimestamps for more info. Your stated event timestamps seem to indicate that this is what is happening.

3) The CHARSET configuration should be in props.conf on the forwarder, NOT on the indexer, as per http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings (see the sketch after this list). Failure to use the correct CHARSET may cause your regexes to fail.

4) MAX_DAYS_HENCE should not play any part here.

5) On a side note (this may not be relevant in your case), you might set the alwaysOpenFile parameter in inputs.conf to 1 on the forwarder, also shown in the sketch below. See http://docs.splunk.com/Documentation/Splunk/latest/admin/inputsconf
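Applied to your setup, the forwarder-side changes from points 3 and 5 might look roughly like this (a sketch only; the monitor stanza is copied from your question, and the inputs.conf docs linked above describe the exact semantics and overhead of alwaysOpenFile before you enable it):

props.conf on the light forwarder

[apache_error]
CHARSET = BIG5

inputs.conf on the light forwarder

[monitor://D:\XXXXXXX\Apache\logs\error.log]
disabled = 0
followTail = 0
sourcetype = apache_error
alwaysOpenFile = 1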

Hope this helps,

Kristian


alextsui
Path Finder

I checked the error.log, and it does not receive a regular stream of events. From what I can see, the file gets written irregularly, at least once a day and at different hours. I also noticed one strange thing: the file's last modified time got stuck at Dec 31, 2010, 9:10am, even though the file has been written to continuously up to now.


mitch_1
Splunk Employee

Very odd. I don't see anything about that configuration that would confuse the timestamping code.

In the original source file, is there a fairly regular stream of events, or are there perhaps large (multi-day) gaps? I'm trying to think of anything that might be relevant.
