Splunk Search

Why do the earliest and latest timechart functions produce unexpected results?

curtisb1024
Path Finder

In the process of trying to verify some summary index data I've noticed that timechart does not seem to return expected results when using the earliest and latest functions.

Example data:

indextime   _time               Value
1438019839  2015-07-27 11:03:27 173755
1438019838  2015-07-27 11:03:10 173755
1438019838  2015-07-27 11:03:09 173755
1438019836  2015-07-27 11:03:05 173750
1438019838  2015-07-27 11:02:46 173750
1438019834  2015-07-27 11:02:29 173750
1438019833  2015-07-27 11:02:28 173750
1438019834  2015-07-27 11:02:24 173747
1438019834  2015-07-27 11:01:56 173747
1438019832  2015-07-27 11:01:39 173747
1438019834  2015-07-27 11:01:39 173747
1438019832  2015-07-27 11:01:33 173727
1438019832  2015-07-27 11:01:15 173727
1438019831  2015-07-27 11:00:58 173727
1438019832  2015-07-27 11:00:56 173727
1438019831  2015-07-27 11:00:52 173717
1438019831  2015-07-27 11:00:32 173717
1438019831  2015-07-27 11:00:14 173717
1438019831  2015-07-27 11:00:13 173717
1438019831  2015-07-27 11:00:09 173712

I've included indextime as I thought it might be relevant. But note that sorting by indextime does not change the earliest and latest values.

Running a timechart using earliest and latest against this data yields results which are clearly incorrect.

| timechart span=1d earliest(Value) as earliestValue, latest(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue

_time       earliestValue  latestValue  maxValue  minValue
2015-07-27  173755         173755       173755    173712

While stats produces the correct output...

| stats earliest(Value) as earliestValue, latest(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue

earliestValue  latestValue  maxValue  minValue
173712         173755       173755    173712

Interestingly, using first and last in place of latest and earliest with timechart does produce the correct output.

| timechart span=1d last(Value) as earliestValue, first(Value) as latestValue, max(Value) as maxValue, min(Value) as minValue

_time       earliestValue  latestValue  maxValue  minValue
2015-07-27  173712         173755       173755    173712

I've searched through the docs and can't find any mention of why this could be occurring. I presume there is some internal reason why timechart functions this way, but it's very counterintuitive and not at all clear. Does anyone know why the earliest and latest functions work this way with timechart?

Running Splunk 6.2.4 on Oracle Enterprise Linux 6.5.

Update:
Results of | table _time Value per @somesoni2's request.

_time               Value
2015-07-27 11:03:27 173755
2015-07-27 11:03:10 173755
2015-07-27 11:03:09 173755
2015-07-27 11:02:46 173750
2015-07-27 11:03:05 173750
2015-07-27 11:01:39 173747
2015-07-27 11:02:24 173747
2015-07-27 11:02:29 173750
2015-07-27 11:01:56 173747
2015-07-27 11:02:28 173750
2015-07-27 11:01:33 173727
2015-07-27 11:01:39 173747
2015-07-27 11:01:15 173727
2015-07-27 11:00:56 173727
2015-07-27 11:00:58 173727
2015-07-27 11:00:14 173717
2015-07-27 11:00:13 173717
2015-07-27 11:00:52 173717
2015-07-27 11:00:32 173717
2015-07-27 11:00:09 173712
1 Solution

woodcock
Esteemed Legend

It should help to consider how bucketing works for timechart (read the docs on bucket, AKA bin). When you tell timechart to bucket with span=1d, Splunk changes every event's _time value (for this search) from whatever it was to 0d@d, which is exactly midnight: 00:00:00.000. Once this has happened, it may be undefined or unpredictable how any version of Splunk selects a single "winner" for "earliest", because all events for "today" now have exactly the same timestamp. Ideally, timechart would calculate earliest and latest before it modifies _time, but perhaps there is a reason that it cannot. IMHO, the situation is either a code bug or a documentation bug (the docs do not mention this aspect), so I would open a support ticket.

But I have one caveat: if you are bucketing twice in a row (e.g. ... | bucket _time span=1h ... | timechart span=1h earliest(value) ...) then you absolutely cannot fault Splunk for being unable to get the right answer, because the bucket command's changes to _time leave timechart with no reliable reference point for breaking the ties correctly. Are you doing two bucketing commands like this?
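
To make the tie problem concrete, here is a minimal Python sketch. It is a model of the explanation above, not Splunk's actual implementation. It assumes that "earliest" is chosen by comparing the bucketed timestamps and that ties go to whichever event is seen first; both assumptions are hypothetical. The rows are a handful of events from the question, in the newest-first order Splunk returned them.

```python
from datetime import datetime

# A few (_time, Value) rows from the question, newest first.
events = [
    ("2015-07-27 11:03:27", 173755),
    ("2015-07-27 11:02:46", 173750),
    ("2015-07-27 11:01:39", 173747),
    ("2015-07-27 11:00:09", 173712),
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

def snap_to_day(t):
    # Models span=1d bucketing: every timestamp becomes midnight of its day.
    return t.replace(hour=0, minute=0, second=0, microsecond=0)

def earliest_value(rows):
    # min() keeps the first row among equal keys, so ties fall back to
    # input order (an assumed tie-break, not documented Splunk behaviour).
    return min(rows, key=lambda r: r[0])[1]

raw = [(parse(ts), v) for ts, v in events]
bucketed = [(snap_to_day(t), v) for t, v in raw]

earliest_raw = earliest_value(raw)            # uses the real timestamps
earliest_bucketed = earliest_value(bucketed)  # all timestamps are now equal
distinct_bucket_times = len({t for t, _ in bucketed})

print(earliest_raw, earliest_bucketed, distinct_bucket_times)
```

Under these assumptions the real timestamps give 173712, while the bucketed ones give 173755, the same wrong value timechart reported in the question.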

curtisb1024
Path Finder

The bucketing is an excellent thought and this seems likely to be the cause of the issue. In further testing, if I add a "| sort + Value" before the timechart the output changes...

_time       earliestValue   latestValue maxValue    minValue
2015-07-27  173712          173712      173755      173712

If bucketing were not the issue (i.e. if the timestamps had not been modified before the earliest and latest functions ran), then the sort would have no effect on the timechart output.
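
As a rough check of that reasoning, here is a small Python sketch. It assumes (hypothetically, this is not confirmed Splunk behaviour) that once every event shares the same bucketed timestamp, both earliest and latest simply keep the first event they see. The values are the distinct Values from the sample data, newest first.

```python
# Every event gets the same bucketed timestamp under span=1d.
BUCKET = "2015-07-27 00:00:00"
values = [173755, 173750, 173747, 173727, 173717, 173712]

def earliest_latest(rows):
    # Assumed tie-break: min() and max() both keep the first row among
    # rows with equal keys, so the result depends only on input order.
    earliest = min(rows, key=lambda r: r[0])[1]
    latest = max(rows, key=lambda r: r[0])[1]
    return earliest, latest

default_order = [(BUCKET, v) for v in values]
# Models putting "| sort + Value" in front of the timechart.
sorted_by_value = sorted(default_order, key=lambda r: r[1])

default_result = earliest_latest(default_order)
sorted_result = earliest_latest(sorted_by_value)

print(default_result, sorted_result)
```

With this model the default order yields 173755 for both columns and the sorted order yields 173712 for both, which matches the two timechart outputs shown in this thread.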

somesoni2
Revered Legend

Can you try putting a " | table _time Value" before the timechart and see the result?

curtisb1024
Path Finder

I've added results per your request. I assume you wanted to see the table output as it's returned from Splunk (rather than being sorted by time). I also tested adding the table before the timechart, but this had no effect on the timechart output.
