Getting Data In

variance betweeen _time and date_* fields

kbrady
Explorer

I've got a situation where different date elements are providing inconsistent results for the same time data. I suspect this is a result of index time vs. search time processing and timezone differences between the data presented and data indexed. The data contains a date in UTC, which is converted to MDT by Splunk. The date_* data is incorrect when compared to the _time field, though is correct for the raw timestamp data.

If this activity were consistent amongst all system provided time fields, I could live with it pretty easily, but the fact that the data presented in the time field differs from the date* fields is problematic. Is this a bug or expected behavior?

Here's info on the search, data returned, props config, and splunk version

Search modifiers:
| eval wd=lower(strftime(time,"%A")) | table _time, date*, wd

Raw time data:
2013-08-16T05:10:05

Data presented:
_time==8/15/13 11:10:05.000 PM
wd==thursday
date_wday==friday
date_mday==16
date_hour==5

props.conf config:
[my_sourcetype]
KV_MODE = json
MAX_TIMESTAMP_LOOKAHEAD = 500
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 100000
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
TIME_PREFIX = ,\"start\":\"
TZ = UTC
pulldown_type = 1

Splunk version:
5.0.4, build 172409

1 Solution

lguinn2
Legend

This is expected behavior and is described in the manual here. The date* fields are a direct parsing of the text timestamp in the input stream. No timezone is applied to them - in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear in an input that uses Linux epoch time.

_time is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time field in Splunk, even if the input stream has no timestamp at all.

If you want to do accurate datetime arithmetic, IMO you should never use the date* fields. Instead, you can extract the necessary information from _time, as you did in your example search. For example

yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay

You can also use the relative_time function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:

yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth

View solution in original post

halkelley
Path Finder

I think your macro needs to have:

        eval  date_wday = strftime(_time, "%A")

for date_wday (capital "A") to work correctly

0 Karma

lguinn2
Legend

Yes, "%a" gives the DOW abbreviation, not the full name of the DOW.

lguinn2
Legend

This is expected behavior and is described in the manual here. The date* fields are a direct parsing of the text timestamp in the input stream. No timezone is applied to them - in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear in an input that uses Linux epoch time.

_time is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time field in Splunk, even if the input stream has no timestamp at all.

If you want to do accurate datetime arithmetic, IMO you should never use the date* fields. Instead, you can extract the necessary information from _time, as you did in your example search. For example

yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay

You can also use the relative_time function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:

yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth

lguinn2
Legend

You could create a macro to help your user base. I'd call the macro something like normalize_datetime and have it calculate/override the existing date* fields. This macro could also be used on data that has no date* fields, providing an added benefit.

eval date_hour = strftime(_time, "%H") | eval date_mday = strftime(_time, "%d") | eval date_minute = strftime(_time, "%M") | eval date_month = strftime(_time, "%m") | eval date_second = strftime(_time, "%S") | eval date_wday = strftime(_time, "%a") | eval date_year = strftime(_time, "%y") | eval date_zone = "UTC"

lguinn2
Legend

Update: you may find that these fields do exist for inputs that do not use text timestamps. So far, I have not figured out how to remove the date_* fields from automatic field extraction - and I have looked... I completely understand the comment "some of my user base will have a harder time"

0 Karma

kbrady
Explorer

Fair enough, at least it's documented. Not a big deal for me to stop using the date_* fields, but some of my user base will have a harder time.

Thank you for a solid explanation.

Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...