Is it possible to get your current timezone with an
eval search command?
Background: I'm trying to build a search that will get events over the last couple of weeks that all occurred during a certain hour of the day. (I'm looking to compare hourly performance and alert if performance deviates too much from a historical baseline.) So it would seem that the most efficient approach would be a `date_hour=n` search filter. (Of course, I want to be able to schedule this to run hourly, and I would prefer not to make 24 of these searches, one for each hour of the day.) So I came up with the idea of using a macro, which almost works, except for one small time zone problem.
Here is my entry in `macros.conf`:

```
[this_hour(1)]
args = reltime
definition = "date_hour=" . replace(strftime(relative_time(time(), "$reltime$"), "%H"), "^0", "")
iseval = 1
```
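For example (a hypothetical expansion; the exact value depends on the clock at evaluation time, and the `-0h@h` argument is just illustrative), invoking the macro at 09:05 with no offset would expand like this:

```
`this_hour(-0h@h)`   -->   date_hour=9
```

(`relative_time` snaps 09:05 back to 09:00, `strftime` formats it as "09", and the `replace` strips the leading zero.)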
This basically gets the current wall-clock time using the `time()` function, formats the value to get just the hour, drops the leading "0", and builds the search expression `date_hour=n`. Exactly what I wanted, well, almost. The problem is that `date_hour` seems to correspond to the local time of the event, and not UTC time, which is what `now()` returns.
So currently, the best I could come up with was to add the `reltime` argument, which lets me manually adjust UTC to local time, but that only works for part of the year, thanks to daylight saving time.
Example, for the EDT (-0400) time zone:

```
eventtype=my_events `this_hour(+4h@h)` | stats ...
```
Does anyone have any ideas on how to make this type of macro work without hard-coding time zone information like this? (Or other ways to search for hourly-sliced windows like this?)
My ultimate goal here is to compare performance across several different axes. So let's say right now is 10 AM on a Thursday. I want to be able to answer the following questions: How does the performance right now compare to the performance of the last 7 days? How does the current performance compare to the 10 AM performance of the last month? And how does the performance compare to the last 5 Thursdays?
So I would like to be able to (as efficiently as possible) capture these different ranges within a single search. (I actually just realized today that you can now search across multiple time ranges within a single search.) I'm thinking that a search to capture all the stats will look something like this:
```
index=summary source=my_perf_metrics series=my_series (latest=@d earliest=-7d@d) OR (date_hour=10 earliest=-1mon@d) OR (date_wday=thursday earliest=-35d@d) | stats ...
```
"date_hour=10" part is what I would like to replace with my
In the end, this search will probably be used to create a baseline-performance CSV file that is generated at the top of every hour. Then a subsequent search (running probably every 5 minutes) could be used to compare the current performance against the baseline values; if the performance is out of range, it can be reported.
The logic of combining the recent (last 7 days) stats, with the hourly stats, and with the day-of-week performance stats is something I have to nail down. (That is, assuming I can get the core search to work.) One simple approach I've been thinking about is simply assigning a weight to the different categories (hourly vs day-of-week vs other).... so yeah, it's a work-in-progress.
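To make the weighting idea concrete, here is one rough sketch (untested; the `slice` and `w` fields, the 0.5/0.3/0.2 weights, and `my_perf_value` as the metric name are all placeholder assumptions, not settled choices):

```
index=summary source=my_perf_metrics series=my_series (latest=@d earliest=-7d@d) OR (date_hour=10 earliest=-1mon@d) OR (date_wday=thursday earliest=-35d@d)
| eval slice=case(_time>=relative_time(now(),"-7d@d"), "recent", date_wday=="thursday", "same_weekday", true(), "same_hour")
| stats avg(my_perf_value) as avg_perf by slice
| eval w=case(slice=="recent",0.5, slice=="same_hour",0.3, slice=="same_weekday",0.2)
| stats sum(eval(avg_perf*w)) as weighted_baseline
```

Note that `case` assigns each event only to the first matching slice, so events falling in overlapping windows are counted once; whether that is the right behavior is part of what still needs nailing down.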
Hmm, I don't think that will work for my use case. I think I would have to use this in conjunction with the `map` command, and I would rather do everything within a single search. So I think I'm limited to whatever I can do in an `eval` expression. But thanks for the suggestion.
`date_hour` is the wrong field to use in this case. Splunk internally already stores event times in `_time`, in UTC. You can report on hourly slices simply by using `timechart span=1h` or `bucket span=1h _time`. Since this operates on absolute time, you don't need to worry about normalizing.
(It is important that your original data is indexed with the correct time zone in the first place for this to work.)
Basically, you want something like:

```
blah blah | bucket span=1h _time | stats avg(my_perf_value) as this_hours_perf by _time
```
and then maybe add

```
... | where this_hours_perf < 1345
```

to see only the hours when it was below a threshold.
Still not totally clear to me why you'd need this, but maybe it makes sense. You can get the local time zone offset in minutes (which does take into account DST), with:
```
... | eval z=strftime(now(),"%z") | eval m=substr(z,-2) | eval h=substr(z,2,2) | eval mzone=((h*60)+m)*(z/abs(z))
```
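Tracing that computation by hand for two illustrative offsets (the second shows a zone that is not a whole number of hours):

```
z = "-0400":  m = "00", h = "04", z/abs(z) = -1  =>  mzone = (4*60 + 0) * -1 = -240
z = "+0530":  m = "30", h = "05", z/abs(z) = +1  =>  mzone = (5*60 + 30) * 1 =  330
```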
Or, obviously, if you just care about the hour offset, you can just use `h*(z/abs(z))`; and to get the current local hour from the current UTC hour, add that offset (mod 24).
I think that trending my data based on local time would be more accurate than UTC, given that all our scheduled jobs run based on local time. In other words, log-intensive jobs run at 7:00 PM EST or at 7:00 PM EDT, which is two different hours based on UTC, but the same `date_hour` based on local time. I want to see the 7 PM spike. (Of course this only matters for a few weeks after the time changes, but that's why I think `date_hour` is a better metric for my data.)
Also, `date_hour` means I can quickly filter out 23 hours of the day at the very beginning of my search, and over a 2-week span this adds up. (In my question, I only asked about hours, but I'm also looking to do some analysis based on the day of week as well. So I'm ultimately looking to combine metrics along a few different axes: the average over the whole last week, the average based on the hour of the day for the last few weeks, plus day of week over the last few months... I'm considering putting this in a single search...)
`date_zone` includes the time zone information for each event, but it is not a good idea to use `date_hour + (date_zone/60)` because: (a) `_time` gives you the same information and is an indexed field (in fact, it is how the entire Splunk index is organized), while the above computation must be done for every event; and (b) time zones aren't always multiples of 60 minutes.
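If the underlying goal is "this hour of the day over the last two weeks", a `_time`-only sketch (assuming your events and your search-time user profile share a time zone, and with `my_perf_value` standing in for the real metric) could look like:

```
index=summary source=my_perf_metrics series=my_series earliest=-14d@h
| eval hour_local=strftime(_time, "%H")
| where hour_local==strftime(now(), "%H")
| bucket span=1h _time
| stats avg(my_perf_value) as perf by _time
```

This trades away the indexed `date_hour` speedup, of course, since every event in the window gets scanned before the `where` filter runs.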
Okay, but if you do want to trend over `date_hour`, you can just use `date_hour` directly, right? There's no need to convert it for time zone?
That's true, I could just do the `stats ... by date_hour`, but then I'm back to my original problem: how do I know what hour I'm currently in if I can't convert `now()` to a different time zone? (This just occurred to me: I guess I could do something like `my search | eventstats first(date_hour) as current_hour | ..... | stats ... by date_hour | where date_hour=current_hour`, but that seems kludgy...)
I updated the question to better explain my overall goals. Perhaps that will help. I agree with your points about doing math against `date_zone`; if I'm going to use the `date_*` fields, it's primarily because they are indexed and super fast to search on, and doing any math on them voids that advantage. 100% accuracy isn't important in my use case. In any case, I'm heading out for the day. Thanks a ton for all of your help on this. 😉 Have a good one!