In the process of trying to verify some summary index data, I've noticed that timechart does not seem to return the expected results when using the earliest and latest functions.
Example data:
indextime _time Value
1438019839 2015-07-27 11:03:27 173755
1438019838 2015-07-27 11:03:10 173755
1438019838 2015-07-27 11:03:09 173755
1438019836 2015-07-27 11:03:05 173750
1438019838 2015-07-27 11:02:46 173750
1438019834 2015-07-27 11:02:29 173750
1438019833 2015-07-27 11:02:28 173750
1438019834 2015-07-27 11:02:24 173747
1438019834 2015-07-27 11:01:56 173747
1438019832 2015-07-27 11:01:39 173747
1438019834 2015-07-27 11:01:39 173747
1438019832 2015-07-27 11:01:33 173727
1438019832 2015-07-27 11:01:15 173727
1438019831 2015-07-27 11:00:58 173727
1438019832 2015-07-27 11:00:56 173727
1438019831 2015-07-27 11:00:52 173717
1438019831 2015-07-27 11:00:32 173717
1438019831 2015-07-27 11:00:14 173717
1438019831 2015-07-27 11:00:13 173717
1438019831 2015-07-27 11:00:09 173712
I've included indextime as I thought it might be relevant. But note that sorting by indextime does not change the earliest and latest values.
Running a timechart using earliest and latest against this data yields results which are clearly incorrect.
| timechart span=1d earliest(Value) as earliestValue, latest(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue
_time earliestValue latestValue maxValue minValue
2015-07-27 173755 173755 173755 173712
While stats produces the correct output...
| stats earliest(Value) as earliestValue, latest(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue
earliestValue latestValue maxValue minValue
173712 173755 173755 173712
Interestingly, using first and last in place of latest and earliest with timechart does produce the correct output.
| timechart span=1d last(Value) as earliestValue, first(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue
_time earliestValue latestValue maxValue minValue
2015-07-27 173712 173755 173755 173712
I've searched through the docs and can't find any mention of why this could be occurring. I presume there is some internal reason why timechart functions this way, but it's very counterintuitive and not at all clear. Does anyone know why the earliest and latest functions work this way with timechart?
Running Splunk 6.2.4 on Oracle Enterprise Linux 6.5.
Update:
Results of | table _time Value per @somesoni2's request.
_time Value
2015-07-27 11:03:27 173755
2015-07-27 11:03:10 173755
2015-07-27 11:03:09 173755
2015-07-27 11:02:46 173750
2015-07-27 11:03:05 173750
2015-07-27 11:01:39 173747
2015-07-27 11:02:24 173747
2015-07-27 11:02:29 173750
2015-07-27 11:01:56 173747
2015-07-27 11:02:28 173750
2015-07-27 11:01:33 173727
2015-07-27 11:01:39 173747
2015-07-27 11:01:15 173727
2015-07-27 11:00:56 173727
2015-07-27 11:00:58 173727
2015-07-27 11:00:14 173717
2015-07-27 11:00:13 173717
2015-07-27 11:00:52 173717
2015-07-27 11:00:32 173717
2015-07-27 11:00:09 173712
It should help to consider how bucketing works for timechart (read the docs on bucket, AKA bin). When you tell timechart to bucket with span=1d, Splunk modifies every event's _time value and changes it (for this search) from whatever it used to be to 0d@d, which is exactly midnight: 00:00:00.000. Once this has happened, it may be undefined or unpredictable how any version of Splunk will select a single "winner" for "earliest" when all events for "today" now have exactly the same timestamp. Ideally timechart would calculate earliest and latest before it modifies _time, but perhaps there is a reason that it cannot. IMHO, the situation is either a code bug or a documentation bug (not mentioning this aspect), so I would open a support ticket.
But I have one caveat: if you are bucketing twice in a row (e.g. ... | bucket _time span=1h ... | timechart span=1h earliest(value) ...) then you absolutely cannot fault Splunk for being unable to get the right answer, because the bucket changes to _time mean that the timechart has no reliable reference point to break the ties correctly. Are you doing two bucketing commands like this?
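To make the tie-breaking idea concrete, here is a small Python toy model. It is only a sketch of the hypothesis, not Splunk's actual implementation: the helper names (snap_to_day, earliest, latest) are made up for illustration, and the assumption that ties are resolved in favour of the first event seen is mine. It uses four rows from the question, in the newest-first order Splunk returned them.

```python
from datetime import datetime

# Four (timestamp, Value) rows from the question, newest first.
events = [
    ("2015-07-27 11:03:27", 173755),
    ("2015-07-27 11:02:46", 173750),
    ("2015-07-27 11:01:39", 173747),
    ("2015-07-27 11:00:09", 173712),
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

def snap_to_day(t):
    # Hypothetical stand-in for what span=1d bucketing does to _time.
    return t.replace(hour=0, minute=0, second=0, microsecond=0)

def earliest(rows):
    # min() keeps the first row it sees when timestamps tie (assumed tie rule).
    return min(rows, key=lambda r: r[0])[1]

def latest(rows):
    # max() also keeps the first row it sees when timestamps tie.
    return max(rows, key=lambda r: r[0])[1]

raw = [(parse(ts), value) for ts, value in events]
bucketed = [(snap_to_day(t), value) for t, value in raw]
resorted = sorted(bucketed, key=lambda r: r[1])  # like "| sort + Value"

print(earliest(raw), latest(raw))            # original timestamps
print(earliest(bucketed), latest(bucketed))  # after snapping to midnight
print(earliest(resorted), latest(resorted))  # snapped, then sorted by Value
```

With the original timestamps the model picks the true earliest and latest values. Once every timestamp is snapped to midnight, both functions just return whichever row comes first, so the answer depends entirely on the order of the incoming events.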
The bucketing is an excellent thought and this seems likely to be the cause of the issue. In further testing, if I add a "| sort + Value" before the timechart the output changes...
_time earliestValue latestValue maxValue minValue
2015-07-27 173712 173712 173755 173712
If bucketing were not the issue (i.e. if the timestamps had not been modified before the earliest and latest functions ran), then the sort would have no effect on the timechart output.
Can you try putting a " | table _time Value" before the timechart and see the result?
I've added results per your request. I assume you wanted to see the table output as it's returned from Splunk (rather than being sorted by time). I also tested adding the table before the timechart, but this had no effect on the timechart output.