Splunk Search

Why do the earliest and latest timechart functions produce unexpected results?

Path Finder

In the process of trying to verify some summary index data I've noticed that timechart does not seem to return expected results when using the earliest and latest functions.

Example data:

indextime    _time                Value
1438019839  2015-07-27 11:03:27 173755
1438019838  2015-07-27 11:03:10 173755
1438019838  2015-07-27 11:03:09 173755
1438019836  2015-07-27 11:03:05 173750
1438019838  2015-07-27 11:02:46 173750
1438019834  2015-07-27 11:02:29 173750
1438019833  2015-07-27 11:02:28 173750
1438019834  2015-07-27 11:02:24 173747
1438019834  2015-07-27 11:01:56 173747
1438019832  2015-07-27 11:01:39 173747
1438019834  2015-07-27 11:01:39 173747
1438019832  2015-07-27 11:01:33 173727
1438019832  2015-07-27 11:01:15 173727
1438019831  2015-07-27 11:00:58 173727
1438019832  2015-07-27 11:00:56 173727
1438019831  2015-07-27 11:00:52 173717
1438019831  2015-07-27 11:00:32 173717
1438019831  2015-07-27 11:00:14 173717
1438019831  2015-07-27 11:00:13 173717
1438019831  2015-07-27 11:00:09 173712

I've included indextime as I thought it might be relevant. But note that sorting by indextime does not change the earliest and latest values.

Running a timechart using earliest and latest against this data yields results which are clearly incorrect.

| timechart span=1d earliest(Value) as earliestValue, latest(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue

_time       earliestValue  latestValue  maxValue  minValue
2015-07-27  173755         173755       173755    173712

While stats produces the correct output...

| stats earliest(Value) as earliestValue, latest(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue

earliestValue  latestValue  maxValue  minValue
173712         173755       173755    173712

Interestingly, using first and last in place of latest and earliest with timechart does produce the correct output.

| timechart span=1d last(Value) as earliestValue, first(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue

_time       earliestValue  latestValue  maxValue  minValue
2015-07-27  173712         173755       173755    173712

I've searched through the docs and can't find any mention of why this could be occurring. I presume there is some internal reason why timechart functions this way, but it's very counterintuitive and not at all clear. Does anyone know why the earliest and latest functions work this way with timechart?

Running Splunk 6.2.4 on Oracle Enterprise Linux 6.5.

Update:
Results of | table _time Value per @somesoni2's request.

_time               Value
2015-07-27 11:03:27 173755
2015-07-27 11:03:10 173755
2015-07-27 11:03:09 173755
2015-07-27 11:02:46 173750
2015-07-27 11:03:05 173750
2015-07-27 11:01:39 173747
2015-07-27 11:02:24 173747
2015-07-27 11:02:29 173750
2015-07-27 11:01:56 173747
2015-07-27 11:02:28 173750
2015-07-27 11:01:33 173727
2015-07-27 11:01:39 173747
2015-07-27 11:01:15 173727
2015-07-27 11:00:56 173727
2015-07-27 11:00:58 173727
2015-07-27 11:00:14 173717
2015-07-27 11:00:13 173717
2015-07-27 11:00:52 173717
2015-07-27 11:00:32 173717
2015-07-27 11:00:09 173712

1 Solution

Esteemed Legend

It should help to consider how bucketing works for timechart (read the docs on bucket, AKA bin). When you tell timechart to bucket with span=1d, Splunk changes every event's _time value (for this search) from whatever it was to 0d@d, which is exactly midnight: 00:00:00.000. Once this has happened, every event for "today" has exactly the same timestamp, so how any version of Splunk selects a single "winner" for "earliest" may be undefined or unpredictable. timechart should calculate earliest and latest before it modifies _time, but perhaps there is a reason that it cannot. IMHO, this is either a code bug or a documentation bug (the docs do not mention this behavior), so I would open a support ticket.

But I have one caveat: if you are bucketing twice in a row (e.g. ... | bucket _time span=1h ... | timechart span=1h earliest(Value) ...), then you absolutely cannot fault Splunk for being unable to get the right answer, because the bucket command's changes to _time leave timechart with no reliable reference point for breaking the ties correctly. Are you running two bucketing commands like this?
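To make the tie problem concrete, here is a small Python sketch. It is only a model, not Splunk's actual implementation: the bucket_to_day helper and the tie-breaking rule (keep the first event seen unless a strictly earlier or later one turns up) are assumptions chosen for illustration. The rows are a few of the events from the question, in the order the search returned them (newest first).

```python
from datetime import datetime

# A few (_time, Value) rows from the question, newest first.
EVENTS = [
    ("2015-07-27 11:03:27", 173755),
    ("2015-07-27 11:02:46", 173750),
    ("2015-07-27 11:01:33", 173727),
    ("2015-07-27 11:00:09", 173712),
]


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def bucket_to_day(t):
    # Hypothetical stand-in for span=1d: snap the timestamp to midnight.
    return t.replace(hour=0, minute=0, second=0, microsecond=0)


def earliest(rows):
    # ASSUMPTION: ties keep the first row seen (strict comparison only).
    best = rows[0]
    for row in rows[1:]:
        if row[0] < best[0]:
            best = row
    return best[1]


def latest(rows):
    # Same assumed tie rule, with the comparison reversed.
    best = rows[0]
    for row in rows[1:]:
        if row[0] > best[0]:
            best = row
    return best[1]


raw = [(parse(ts), value) for ts, value in EVENTS]
bucketed = [(bucket_to_day(t), value) for t, value in raw]

print(earliest(raw), latest(raw))            # original timestamps
print(earliest(bucketed), latest(bucketed))  # every timestamp now ties
```

With the original timestamps the sketch picks 173712 and 173755, as stats does. After bucketing, every row carries the same midnight timestamp, so under the assumed rule both functions fall back to the first row in the stream (173755), which is the pattern the timechart output in the question shows.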

Path Finder

The bucketing is an excellent thought and this seems likely to be the cause of the issue. In further testing, if I add a "| sort + Value" before the timechart the output changes...

_time       earliestValue   latestValue maxValue    minValue
2015-07-27  173712          173712      173755      173712

If bucketing were not the issue (i.e. if the timestamps had not been modified before the earliest and latest functions ran), then the sort would have no effect on the timechart output.
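The same reasoning can be checked with a short Python sketch. Again, this is a model under stated assumptions rather than how Splunk is actually implemented: the pick and summarize helpers are hypothetical, and the tie rule (the first event seen wins unless another is strictly earlier or later) is assumed. It uses a few rows from the sample data and compares the returned order against the order after "| sort + Value".

```python
import operator
from datetime import datetime

# A few (_time, Value) rows from the sample data, newest first.
ROWS = [
    ("2015-07-27 11:03:27", 173755),
    ("2015-07-27 11:02:46", 173750),
    ("2015-07-27 11:01:33", 173727),
    ("2015-07-27 11:00:09", 173712),
]


def original(ts):
    # Timestamp as indexed, untouched.
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def to_midnight(ts):
    # Hypothetical model of span=1d bucketing.
    return original(ts).replace(hour=0, minute=0, second=0)


def pick(rows, time_of, is_better):
    # ASSUMPTION: on ties the first row seen is kept.
    best = rows[0]
    for row in rows[1:]:
        if is_better(time_of(row[0]), time_of(best[0])):
            best = row
    return best[1]


def summarize(rows, time_of):
    # Returns (earliestValue, latestValue) under the assumed rule.
    return (pick(rows, time_of, operator.lt),
            pick(rows, time_of, operator.gt))


sorted_rows = sorted(ROWS, key=lambda r: r[1])  # models "| sort + Value"

print(summarize(ROWS, original), summarize(sorted_rows, original))
print(summarize(ROWS, to_midnight), summarize(sorted_rows, to_midnight))
```

With untouched timestamps the sort changes nothing. With bucketed timestamps the result follows whichever row comes first, so sorting ascending by Value moves both columns to 173712, matching the output above.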

SplunkTrust

Can you try putting a " | table _time Value" before the timechart and see the result?

Path Finder

I've added results per your request. I assume you wanted to see the table output as it's returned from Splunk (rather than being sorted by time). I also tested adding the table before the timechart, but this had no effect on the timechart output.
