Hi,
I set up a file directory data input in Splunk 4.3 to index our log files, which are generated in multiple time zones but stored in a single shared directory on a network drive. The file timestamps look OK when viewed in Windows File Explorer, i.e. they are already normalized to our local time, but Splunk carries out another normalization and, as a result, some log files end up with timestamps in the future.
Is there a way to disable Splunk's normalization of file timestamps for a selected data input?
Thanks,
Adam
You can't do it per input, but you can do it per sourcetype, host, or source.
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition
UPDATED TO REFLECT COMMENT:
There are two problems here. The first is that your time zone offset doesn't follow the strptime specification, which is what the error is telling you:
http://docs.python.org/library/datetime.html?highlight=strptime#strftime-behavior
GMT +5 isn't valid, while +0500 is valid.
Second, the strptime format you mentioned for the date appears to be incorrect: %n does not belong in the middle of the expression.
I think your props.conf stanza should be using
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
#to stop us from looking at tz in timestamp
MAX_TIMESTAMP_LOOKAHEAD = 24
#http://en.wikipedia.org/wiki/List_of_zoneinfo_timezones for list of valid TZ's
TZ = YourTZ
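To see the offset point outside Splunk, here is a minimal Python sketch. It uses Python's own strptime, not Splunk's parser, so treat it only as an illustration of the "+0500" form; Splunk may behave differently. It rewrites the "(GMT+5)" style into "+0500" and parses it with %z. The helper names normalize_offset and parse_stamp are made up for this example, and %f stands in for Splunk's %3N.

```python
import re
from datetime import datetime, timezone

# Matches the "(GMT+5)", "(GMT -5)", "(GMT+10)" styles mentioned in this thread.
GMT_OFFSET = re.compile(r"\(GMT\s*([+-])(\d{1,2})\)")


def normalize_offset(stamp: str) -> str:
    """Rewrite '(GMT+5)' as '+0500' so that strptime's %z can read it."""
    return GMT_OFFSET.sub(
        lambda m: f"{m.group(1)}{int(m.group(2)):02d}00", stamp
    )


def parse_stamp(stamp: str) -> datetime:
    """Parse a timestamp such as '2012-07-18 10:11:24,856 (GMT+5)'."""
    # %f (fractional seconds) is Python's counterpart to %3N here.
    return datetime.strptime(normalize_offset(stamp), "%Y-%m-%d %H:%M:%S,%f %z")


parsed = parse_stamp("2012-07-18 10:11:24,856 (GMT+5)")
print(parsed.isoformat())                           # local time with its offset
print(parsed.astimezone(timezone.utc).isoformat())  # same instant in UTC
```

Only whole-hour offsets are handled; something like GMT+5:30 would need a wider pattern.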
I believe I do need to process the timezone information in each individual log message because different messages may come from different timezones (GMT -5, GMT +1, GMT +10, etc). Is there a way in Splunk to parse it?
Alternatively, going back to a solution using DATETIME_CONFIG = NONE, I would somehow need to tell Splunk to use the file modification date without the timezone information; perhaps your suggestion, TZ = YourTZ, would work here.
Regarding the %n formatter, I used it to represent white space. I followed the Unix strptime specification at
http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html
linked from
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition.
Perhaps your link (a Python specification) should be there instead.
Many thanks.
Sorry, here is the strptime format that I used:
%Y-%m-%d%n%H:%M:%S,%3N%n\(%Z\)
and the timestamp prefix:
^xxx\|x\.xx\.x\.xx\|xxxxxxx\|xxx\\xxxx\|
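For anyone checking a prefix like this, a quick Python sketch shows why the backslashes matter. It uses Python's re module rather than Splunk's regex engine, so it is only an approximation, and the variable names are made up for the example. An unescaped | is alternation, so the pattern would stop matching after the first alternative, while the escaped form consumes the whole literal prefix and leaves the timestamp next.

```python
import re

# Literal text that precedes the timestamp in the sample event (placeholders as posted).
literal_prefix = "xxx|x.xx.x.xx|xxxxxxx|xxx\\xxxx|"
event = literal_prefix + "2012-07-18 10:11:24,856 (GMT+5)|ERROR|x"

# re.escape backslash-escapes the regex metacharacters in the prefix: '|', '.' and '\'.
prefix_regex = "^" + re.escape(literal_prefix)
print(prefix_regex)

# With the escaped prefix, the match ends exactly where the timestamp begins.
match = re.match(prefix_regex, event)
timestamp_region = event[match.end():match.end() + 23]
print(timestamp_region)

# Unescaped, '|' separates alternatives, so the first one ('^xxx') wins
# and the match ends after only three characters.
unescaped_end = re.match(r"^xxx|x.xx", event).end()
print(unescaped_end)
```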
I tried DATETIME_CONFIG = NONE in the config file, but the result was the same as before. I believe Splunk did use file modification date/time but it probably also picked up the timezone information and applied the normalization, resulting in some timestamps being in the future.
I then tried to use the data input preview function and tell Splunk my date/time format but Splunk could not recognize it ("Could not use strptime to parse timestamp").
Here is my generic log file entry:
xxx|x.xx.x.xx|xxxxxxx|xxx\xxxx|2012-07-18 10:11:24,856 (GMT+5)|ERROR|x|xxx.xx.xxx.xxx.xxx:x|xx:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I entered the following timestamp format for strptime:
%Y-%m-%d%n%H:%M:%S,%3N%n(%Z)
I also specified the timestamp prefix pattern:
^xxx|x.xx.x.xx|xxxxxxx|xxx\xxxx|
Is there a time offset specified in the events? Splunk will use that offset if you haven't specified a time zone or configured it to ignore the offset.