I've got a situation where different date elements are providing inconsistent results for the same time data. I suspect this is a result of index time vs. search time processing and timezone differences between the data presented and data indexed. The data contains a date in UTC, which is converted to MDT by Splunk. The date_* data is incorrect when compared to the _time field, though is correct for the raw timestamp data.
If this activity were consistent amongst all system provided time fields, I could live with it pretty easily, but the fact that the data presented in the time field differs from the date* fields is problematic. Is this a bug or expected behavior?
Here's info on the search, data returned, props config, and splunk version
Search modifiers:
| eval wd=lower(strftime(time,"%A")) | table _time, date*, wd
Raw time data:
2013-08-16T05:10:05
Data presented:
_time==8/15/13 11:10:05.000 PM
wd==thursday
date_wday==friday
date_mday==16
date_hour==5
props.conf config:
[my_sourcetype]
KV_MODE = json
MAX_TIMESTAMP_LOOKAHEAD = 500
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 100000
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
TIME_PREFIX = ,\"start\":\"
TZ = UTC
pulldown_type = 1
Splunk version:
5.0.4, build 172409
This is expected behavior and is described in the manual here. The date*
fields are a direct parsing of the text timestamp in the input stream. No timezone is applied to them - in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear in an input that uses Linux epoch time.
_time
is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time
field in Splunk, even if the input stream has no timestamp at all.
If you want to do accurate datetime arithmetic, IMO you should never use the date*
fields. Instead, you can extract the necessary information from _time
, as you did in your example search. For example
yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay
You can also use the relative_time
function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:
yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth
I think your macro needs to have:
eval date_wday = strftime(_time, "%A")
for date_wday (capital "A") to work correctly
Yes, "%a" gives the DOW abbreviation, not the full name of the DOW.
This is expected behavior and is described in the manual here. The date*
fields are a direct parsing of the text timestamp in the input stream. No timezone is applied to them - in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear in an input that uses Linux epoch time.
_time
is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time
field in Splunk, even if the input stream has no timestamp at all.
If you want to do accurate datetime arithmetic, IMO you should never use the date*
fields. Instead, you can extract the necessary information from _time
, as you did in your example search. For example
yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay
You can also use the relative_time
function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:
yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth
You could create a macro to help your user base. I'd call the macro something like normalize_datetime
and have it calculate/override the existing date*
fields. This macro could also be used on data that has no date*
fields, providing an added benefit.
eval date_hour = strftime(_time, "%H") | eval date_mday = strftime(_time, "%d") | eval date_minute = strftime(_time, "%M") | eval date_month = strftime(_time, "%m") | eval date_second = strftime(_time, "%S") | eval date_wday = strftime(_time, "%a") | eval date_year = strftime(_time, "%y") | eval date_zone = "UTC"
Update: you may find that these fields do exist for inputs that do not use text timestamps. So far, I have not figured out how to remove the date_*
fields from automatic field extraction - and I have looked... I completely understand the comment "some of my user base will have a harder time"
Fair enough, at least it's documented. Not a big deal for me to stop using the date_* fields, but some of my user base will have a harder time.
Thank you for a solid explanation.