When we started using Splunk a couple of years ago, we needed to calculate means over various time windows (5, 20, and 60 minutes).
Back then, the _time value after the bucket (bin) operation was the ending time of the time slot for each new row, so times were bucketed like this:
00:00:00.001 through 00:05:00 ---> bucket 00:05
00:05:00.001 through 00:10:00 ---> bucket 00:10
search .... | bin _time minspan=5m | ...
So 10:06, 10:07, and 10:08 would all be in the 10:10 bucket. When I ran it today (6.5.2), the new _time value was that of the previous time slot, 10:05. This makes no sense to me. Why label events with a time earlier than when they occurred? I experimented with hours and found the same thing. However, when I upped it to days (span=1d), it used the correct current day.
Is there a way to tell it to bucket the times on the ending value instead of the starting value?
For me, _time has always been the start of the bucket, but I have only used Splunk in versions 6 and above.
Notice, you are asking for different behavior for days than you are for minutes.
If you applied the same logic to days, anything you do today will have tomorrow's date!
Same for hours. Standard bin assigns 11:00 to all times from 11:00:00 up to but not including 12:00:00.
If you really want to use the end time for _time, then you have two tweaks to make:
1) Subtract a tiny amount (like a microsecond) from _time before the bin, if you want to be sure that events exactly at 10:45:00 end up in the 10:45:00 bucket.
2) Add the bin size to _time after binning.
| eval _time = _time -.000001
| bin _time span=5m
| eval _time = _time + 300
Usually, if it ever matters for presentation and clarity, I just add another field for the end_time of the bin.
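As a rough sketch of that last approach, something like the following keeps _time as the bucket start and carries the bucket end in a separate field. The field name value and the alias mean_value are placeholders for illustration, not names from the original search, and 300 assumes a 5-minute span.

```
... | bin _time span=5m
| eval end_time = _time + 300
| stats avg(value) AS mean_value BY _time, end_time
| fieldformat end_time = strftime(end_time, "%H:%M:%S")
```

With this, an event at 10:06 should land in the row with _time 10:05:00 and end_time 10:10:00, so the table shows both edges of the bucket without changing how bin labels it.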