Getting Data In

How to configure Splunk to convert multiple datetime attributes within the same data source to epoch at index-time and not treat them as strings?

Path Finder

Hi,

In our data source (an application log file), we have multiple datetime attributes (say update_time, order_time, delivery_time, in that order). When Splunk indexes this data, it automatically takes order_time, which appears first in the log, as the event timestamp and indexes accordingly, which is fine. However, the other datetime attributes are read as strings, not as datetime values. We have to convert them using the convert command before applying certain calculations, such as finding the time taken between delivery and ordering. Another issue is that these values do not have a timezone component, so if a user changes the user-level timezone setting, these values do not change because they are treated as text.
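The workaround described above looks roughly like this (field names from the question; the timestamp format string is an assumption):

```
... | convert timeformat="%m-%d-%Y %H:%M:%S" mktime(order_time) mktime(delivery_time)
    | eval delivery_duration = delivery_time - order_time
```

Here convert's mktime() parses each string field into epoch seconds so the subtraction works, but it has to be repeated in every search.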

Is there a way to let Splunk treat them as datetime attributes and convert them to epoch during indexing itself?


Esteemed Legend

You only need to normalize your times if you are using the _time variable, which you can easily (and should) avoid. You should only be interacting with _time through the time picker to select your set of events; after that, you should use the extracted field names. For example, if you need transaction_seconds, then you would add | eval transaction_seconds = coalesce(delivery_time, update_time, order_time) - order_time (the later timestamp minus the earlier one). If you really need to normalize to epoch, then you need to do all of the following:
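Since the extracted fields arrive as strings, that calculation needs a strptime() conversion first. A sketch (field names from the question; the timestamp format "%m-%d-%Y %H:%M:%S" is an assumption):

```
... | eval order_epoch = strptime(order_time, "%m-%d-%Y %H:%M:%S")
    | eval end_epoch   = strptime(coalesce(delivery_time, update_time, order_time), "%m-%d-%Y %H:%M:%S")
    | eval transaction_seconds = end_epoch - order_epoch
```

This keeps _time untouched and does all the arithmetic on the extracted fields at search time.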

1: Set your personal user settings for TZ to the correct value for you (not strictly necessary, but it will help avoid confusion).
2: When using fieldformat (which is generally preferable) or convert, be sure to apply the date_zone field to ALL the timestamp fields (except for _time) like this:

| eval normalized_update_time = update_time - date_zone | convert timeformat="%m-%d-%Y %H:%M:%S" ctime(normalized_update_time)

3: Splunk should be able to handle the timestamp fields as-is, but if you find that it does not, then just use tonumber first like this:

| eval normalized_update_time = tonumber(update_time) - date_zone | convert timeformat="%m-%d-%Y %H:%M:%S" ctime(normalized_update_time)

4: Keep in mind that now you are dealing with times that are all normalized to GMT, NOT to your timezone. Regardless of your setting in step #1, the convert (and other) commands present the data to you in GMT, which is good because no matter who runs the commands, they present the same values. So the last step is to normalize one more time to your TZ like this (the example is for Central Time, which is 6 hours behind GMT):

| eval normalized_update_time = tonumber(update_time) - date_zone - (6*60*60) | convert timeformat="%m-%d-%Y %H:%M:%S" ctime(normalized_update_time)

The only remaining problem is if/when date_zone is "local", which means you need to know the host OS value for TZ on the splunk_server that indexed the event (or on the Heavy Forwarder, if you are using one) and accommodate this accordingly.
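One way to guard against the "local" case is to substitute a known offset for the indexer's OS timezone. A sketch (the -21600 seconds is a placeholder assumption for a US Central indexer; the offset units follow the steps above):

```
... | eval zone_offset = if(date_zone == "local", -21600, tonumber(date_zone))
    | eval normalized_update_time = tonumber(update_time) - zone_offset
    | convert timeformat="%m-%d-%Y %H:%M:%S" ctime(normalized_update_time)
```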

Some of these steps could be avoided if Splunk provided a command (or an option to a command) to consider each user profile's TZ setting in presenting the output to the user.

See the following for more info:

http://docs.splunk.com/Documentation/Splunk/6.2.3/SearchReference/Commontimeformatvariables
http://docs.splunk.com/Documentation/Splunk/6.2.3/SearchReference/CommonEvalFunctions
http://www.strftime.net
