When I use timechart, trailing buckets with zero count are still displayed as zero, and the time axis extends to the end of the search window. But over the same time window, if I use chart over _time, the trailing zero-count buckets are removed. For example,
index = _internal earliest=-3h@h latest=+3h@h ``` simulate trailing zero-count buckets```
| timechart span=1h count
This gives
_time | count |
2023-10-19 05:00 | 33798 |
2023-10-19 06:00 | 33798 |
2023-10-19 07:00 | 33949 |
2023-10-19 08:00 | 27416 |
2023-10-19 09:00 | 0 |
2023-10-19 10:00 | 0 |
Note the last two buckets are zero-count. Whereas this
index = _internal earliest=-3h@h latest=+3h@h ``` simulate zero-count buckets ```
| bucket _time span=1h
| chart count over _time
gives
_time | count |
2023-10-19 05:00 | 33798 |
2023-10-19 06:00 | 33798 |
2023-10-19 07:00 | 33949 |
2023-10-19 08:00 | 27438 |
The two trailing buckets are not listed, even though info_max_time is exactly the same.
Is there a way to force chart to list all _time buckets between info_min_time and info_max_time?
The timechart command generates a time series for the selected time range, so you get rows for the full time window even when certain buckets have no results.
The chart command, like the stats command, generates statistics only for the _time buckets that exist in the data, so a time bucket with 0 events will not be shown (it can't aggregate what isn't present).
There is a workaround to get the full time series with chart as well, but it's not that pretty. If timechart is an option, use that.
Here is the workaround query:
index = _internal ``` simulate zero-count buckets ```
| bucket _time span=5m
| chart count over _time
| append [| makeresults | addinfo
| eval time=mvrange(info_min_time, info_max_time+1,300)
| rename comment as "third argument should be in seconds and match the span you selected for chart"
| table time | mvexpand time
| rename time as _time | eval count=0]
| chart sum(count) as count by _time
Using mvrange with time! I think you also gave me this a long time ago for a different question, but with a unit instead of directly with _time. (mvexpand with info_max_time - info_min_time is too much.)
Combining that lesson (thanks again!) and this formula, and working out some Splunk kinks, I can make it work with simple count.
To start, I also realized that addinfo in makeresults does not work the same way as in a search command. So I modified my simulation strategy a little. This will be my new baseline:
index = _internal
| where _time < relative_time(now(), "-2h@h") ``` simulate zero-count buckets ```
| timechart span=1h count
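On the addinfo point: as far as I understand, a subsearch takes its time range from the time picker rather than from earliest/latest written inline in the outer search, which would explain the difference. One way around it is to compute the bounds inside the subsearch with relative_time instead of addinfo. This is an untested sketch: the -3h@h/+3h@h offsets are assumed from the first example, and t_start/t_end are names made up for illustration.

```spl
| makeresults
| eval t_start = relative_time(now(), "-3h@h"), t_end = relative_time(now(), "+3h@h") ``` assumed window, same offsets as the first example ```
| eval time = mvrange(t_start, t_end, 3600) ``` one value per hour; the end bound is exclusive ```
| mvexpand time
| rename time as _time
| eval count = 0
| table _time count
```

Run on its own, this should list one zero-count row per hourly bucket in the window, which makes it easy to check the scaffold before appending it to the chart results.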
The complete workaround will be
index = _internal
| where _time < relative_time(now(), "-2h@h") ``` simulate zero-count buckets ```
| bucket _time span=1h@h
| chart count over _time
| append
[| makeresults | addinfo
| eval hours = mvrange(0, round((info_max_time - info_min_time) / 3600))
| eval time = mvmap(hours, info_min_time + hours * 3600)
| table time | mvexpand time
| rename time as _time
| bucket _time span=1h@h
| eval count=0]
| stats sum(count) as count by _time
I should also have noted in the OP that my chart has a groupby clause. So I moved my baseline to
index = _internal sourcetype IN (splunkd, splunkd_access, splunkd_ui_access)
| where _time < relative_time(now(), "-2h@h") ``` simulate zero-count buckets ```
| timechart span=1h count by sourcetype
The workaround with groupby therefore is
index = _internal sourcetype IN (splunkd, splunkd_access, splunkd_ui_access)
| where _time < relative_time(now(), "-2h@h") ``` simulate zero-count buckets ```
| bucket _time span=1h@h
| chart count over _time by sourcetype
| append
[| makeresults | addinfo
| eval hours = mvrange(0, round((info_max_time - info_min_time) / 3600))
| eval time = mvmap(hours, info_min_time + hours * 3600)
| table time | mvexpand time
| rename time as _time
| bucket _time span=1h@h
| foreach splunkd, splunkd_access, splunkd_ui_access
[eval <<FIELD>> = 0]]
| chart sum(*) as * by _time
This is super messy; it can be daunting if there are many values in groupby, or if values are unpredictable. As you said, I should try to stick to timechart when dealing with time series.
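One possible way to avoid listing the groupby values by hand is to keep the results in long form (one row per _time and sourcetype), append a scaffold of empty buckets under a placeholder series, and pivot with xyseries at the end. This is a different technique from the foreach approach above, offered as an untested sketch. The placeholder name scaffold is made up, the span is assumed to be 1h, and the scaffold still takes its range from addinfo, so it follows the time picker and assumes that range is hour-aligned.

```spl
index = _internal sourcetype IN (splunkd, splunkd_access, splunkd_ui_access)
| where _time < relative_time(now(), "-2h@h") ``` simulate zero-count buckets ```
| bucket _time span=1h
| stats count by _time sourcetype
| append
    [| makeresults | addinfo
    | eval time = mvrange(info_min_time, info_max_time, 3600)
    | mvexpand time
    | rename time as _time
    | bucket _time span=1h
    | eval sourcetype = "scaffold", count = 0 ``` placeholder series, hypothetical name ```
    | table _time sourcetype count]
| xyseries _time sourcetype count
| fields - scaffold
| fillnull value=0
| sort 0 _time
```

Because the real series names never appear after the base search, this should cope with many or unpredictable groupby values, at the cost of an extra pivot step.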