I'm using the most recent version of Splunk Light Forwarder to forward .csv files to my main Splunk server (4.2, build 96430). There are 30 files on each of 4 servers, and the files are updated with a few rows every minute.
I have noticed that after the services have run for a few days, these .csv files all stop being indexed at around the same time. In the most recent incident, indexing stopped at around 26,000 rows, when the files were about 8.5 MB in size. A server that is less active and doesn't have as much data does not appear to be affected.
The main server's splunkd.log shows something very strange: at a certain point, Splunk decides that timestamps from less than a minute ago are "outside of the acceptable time window":
12-07-2011 21:50:05.776 -0500 WARN DateParserVerbose - A possible timestamp match (Wed Dec 07 21:49:03 2011) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context="source::\servername\logs\timings16_0.csv|host::servername|summary_timings|remoteport::53957"
12-07-2011 21:50:05.776 -0500 WARN DateParserVerbose - A possible timestamp match (Wed Dec 07 21:49:42 2011) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context="source::\servername\logs\timings16_0.csv|host::servername|summary_timings|remoteport::53957"
and then shows thousands of "similar messages suppressed".
I have the following in props.conf, but I don't understand why Splunk suddenly decides these timestamps are out of range when they clearly are not, or why it only seems to do this after the files have reached a certain size.
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 999999
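For completeness, here are the same settings as a full stanza (the sourcetype name is taken from the log context above; the commented-out values are, I believe, the defaults, which I have not overridden):

```
[summary_timings]
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 999999
# Not overridden -- I believe these are the defaults:
# MAX_DAYS_AGO = 2000
# MAX_DIFF_SECS_HENCE = 604800
```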
If I restart the services so that new .csv files are generated, indexing resumes.
Any idea what's going on here?
The problem is surely that you and Splunk disagree on the TZ applied to the events. You should have each forwarding server set up with NTP so that the clock cannot drift, and you should use a search like this (especially around DST changes) to check whether you have a TZ problem; a healthy average lag is small (0 < avg < 1000 seconds):
index=* | eval lagSecs=_indextime - _time | stats avg(lagSecs) by index,sourcetype,host
You fix it like this:
$SPLUNK_HOME/etc/apps/MyApp/default/props.conf:
[host::99999\.9999\.9999\.9999]
TZ = US/Pacific
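The stanza can also be scoped by sourcetype or source instead of host; for example (hypothetical sourcetype name, TZ chosen to match your forwarders' locale):

```
[my_sourcetype]
TZ = US/Eastern
```

Note that timestamp parsing happens wherever the events are cooked (the indexer, or a heavy forwarder if you have one in the path), so that is the instance whose props.conf needs this setting; restart it for the change to take effect.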
You can track pre-fix/post-fix changes by examining date_zone like this:
index=* | eval lagSecs=_indextime - _time | stats avg(lagSecs) by index,sourcetype,host,date_zone
Hi woodcock,
Can you please provide a solution for this? Here's our path:

UFs ------> HF ------> Splunk

We're seeing these WARN messages on our HF:
05-02-2016 16:36:21.908 -0500 WARN DateParserVerbose - Accepted time format has changed ((?i)(?<!\w|\d[:\.\-])(?i)(?<![\d\w])(jan|\x{3127}\x{6708}|feb|\x{4E8C}\x{6708}|mar|\x{4E09}\x{6708}|apr|\x{56DB}\x{6708}|may|\x{4E94}\x{6708}|jun|\x{516D}\x{6708}|jul|\x{4E03}\x{6708}|aug|\x{516B}\x{6708}|sep|\x{4E5D}\x{6708}|oct|\x{5341}\x{6708}|nov|\x{5341}\x{3127}\x{6708}|dec|\x{5341}\x{4E8C}\x{6708})[a-z,\.;]*([/\- ]) {0,2}(?i)(0?[1-9]|[12]\d|3[01])(?!:) {0,2}(?:\d\d:\d\d:\d\d(?:[\.\,]\d+)? {0,2}(?i)((?:(?:UT|UTC|GMT(?![+-])|CET|CEST|CETDST|MET|MEST|METDST|MEZ|MESZ|EET|EEST|EETDST|WET|WEST|WETDST|MSK|MSD|IST|JST|KST|HKT|AST|ADT|EST|EDT|CST|CDT|MST|MDT|PST|PDT|CAST|CADT|EAST|EADT|WAST|WADT|Z)|(?:GMT)?[+-]\d\d?:?(?:\d\d)?)(?!\w))?)?((?:\2|,) {0,2}(?i)(20\d\d|19\d\d|[901]\d(?!\d)))?(?!/|\w|\.\d)), possibly indicating a problem in extracting timestamps. Context: source::/opt/apps/miware/server/jvm01/log/server.log|host::ws97yelx|log4j|135220
05-02-2016 16:38:34.387 -0500 WARN DateParserVerbose - A possible timestamp match (Mon Jan 2 16:38:31 2017) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/opt/apps/mware-6.4/tst/jvm01/log/server.log|host::wn76yflx|log4j|3160
05-02-2016 16:38:34.387 -0500 WARN DateParserVerbose - Failed to parse timestamp. Defaulting to timestamp of previous event (Thu Jan 7 16:38:31 2016). Context:source::/opt/apps/mware-6.4/tst/jvm01/log/server.log|host::wn76yflx|log4j|3160
I ran this search and found that many of the hosts and sourcetypes show date_zone=-300:
index=* NOT date_zone=local | eval lagSecs=_indextime - _time | stats avg(lagSecs) by index,sourcetype,host,date_zone
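Converting date_zone (which, as I understand it, is the parsed event offset in minutes from UTC, so -300 means UTC-5) into hours makes the skew easier to read; a variant of the same search:

```
index=* NOT date_zone=local
| eval lagSecs = _indextime - _time
| eval tz_hours = date_zone / 60
| stats avg(lagSecs) by index, sourcetype, host, date_zone, tz_hours
```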