Splunk Enterprise

Bucket Roll Over

silverKi
Path Finder
indexes.conf

[volume:hot]
path = /mnt/splunk/hot
maxVolumeDataSizeMB = 40

[volume:cold]
path = /mnt/splunk/cold
maxVolumeDataSizeMB = 40

[A]
homePath = volume:hot/A/db
coldPath = volume:cold/A/colddb
maxDataSize = 1
maxTotalDataSizeMB = 90
thawedPath = $SPLUNK_DB/A/thaweddb

[_internal]
homePath = volume:cold/_internaldb/db
coldPath = volume:cold/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
maxDataSize = 1
maxTotalDataSizeMB = 90




I collected data into each index, and the amount stored on the cold volume was A = 30MB and _internaldb/db = 10MB. I understood A to account for the larger share because the data volume and ingestion rate of the A index were higher than those of _internal.
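(For reference, I checked the per-index sizes by bucket state with a search roughly like this - a rough sketch that assumes the standard dbinspect output fields:)

| dbinspect index=*
| stats sum(sizeOnDiskMB) AS size_mb BY index, state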


If I stop collecting data into the A index and keep collecting only into the _internal index, the old buckets in _internaldb/db are rolled to _internaldb/colddb in the order they were written, but they are not retained in colddb in that order; they are deleted immediately. In addition, the data that existed in A/colddb is deleted oldest first. I understood this as: the cold volume is limited to 40MB and is already full, so the buckets rolled into _internaldb/colddb cannot be retained there and are deleted immediately.

However, why is the data in A/colddb deleted? And afterwards, once A/colddb shrinks to 20MB, nothing more is deleted from it.


The behavior I expected was that A/colddb would keep being deleted until it reached 0MB, and that the old buckets from _internaldb/db would then be moved to _internaldb/colddb and retained there. I'm curious why the result differs from what I expected, and whether, when maxTotalDataSizeMB is the same, the volume keeps the same ratio between the indexes.


PickleRick
SplunkTrust
homePath = volume:cold/_internaldb/db

Are you sure you want your hot data on the cold volume?

Anyway, you have small buckets, which is relatively rare for a Splunk installation, and that skews your observations. There is no guarantee that the limits will be enforced precisely.

It works like this - every now and then (I don't remember the exact interval; you can find it in server.conf) the housekeeping thread wakes up and checks the indexes.

If a hot bucket triggers criteria (bucket size, inactivity time and so on), it is rolled to warm.

If warm buckets for an index trigger criteria (number of warm buckets per index), the oldest bucket for that index (in terms of the most recent event in the bucket) is rolled to cold.

If the hot/warm volume exceeds its size limit, the oldest bucket on the whole volume is rolled to cold.

If cold buckets for an index trigger criteria (retention time, data size), the oldest bucket is rolled to frozen.

If the cold volume exceeds its size limit, the oldest bucket on the whole volume is rolled to frozen.

That's how it's supposed to work.
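For reference, these are the main indexes.conf settings behind each of those steps - a rough sketch, where the per-index values are the stock defaults as far as I remember them (check the indexes.conf spec for your version) and the volume caps are the ones from your config:

[A]
# hot bucket max size; reaching it rolls the bucket to warm (auto = 750MB)
maxDataSize = auto
# hot bucket time criteria also exist (maxHotSpanSecs, maxHotIdleSecs)
# warm bucket count; beyond this the oldest warm bucket for the index rolls to cold
maxWarmDBCount = 300
# retention time; cold buckets whose newest event is older than this get frozen
frozenTimePeriodInSecs = 188697600
# per-index size cap; exceeding it freezes the oldest bucket of this index
maxTotalDataSizeMB = 500000

[volume:hot]
path = /mnt/splunk/hot
# hot/warm volume cap; exceeding it rolls the oldest bucket on the volume to cold
maxVolumeDataSizeMB = 40

[volume:cold]
path = /mnt/splunk/cold
# cold volume cap; exceeding it freezes the oldest bucket on the volume
maxVolumeDataSizeMB = 40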


silverKi
Path Finder

@PickleRick 

So shouldn’t the A/colddb capacity become 0 at some point?


PickleRick
SplunkTrust

Depends on what you mean by "capacity". Splunk doesn't have "capacity" as such. It doesn't have its own limit above which it won't index events. It will simply freeze old data when some limits are exceeded. (OK, technically it will stop ingesting if the underlying storage becomes full and OS-level writes cannot be done anymore, but that's a different story.)

Buckets from A/colddb might get frozen at some point because, while the A index does not exceed its own thresholds, the overall size of the buckets on the volume is too big and A's buckets are the oldest ones.
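And just to be clear, "frozen" by default means the bucket is deleted. If you want frozen buckets archived instead, you can set an archive path per index, something like this (the path is just an example):

[A]
# archive frozen buckets to this directory instead of deleting them
coldToFrozenDir = /mnt/splunk/frozen/A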


silverKi
Path Finder

@PickleRick

What I understand is this: when data collection into the A index is stopped and only collection into the _internal index continues, the cold volume still holds what was collected from both indexes - 30MB in A/colddb and 10MB in _internaldb/db - and from then on only _internal data is added. Since A/colddb holds the oldest buckets on the cold volume, is that why it shrinks from 30MB to 20MB? By that logic, shouldn't all of the data stored in A/colddb eventually be frozen?


PickleRick
SplunkTrust

Which buckets get frozen is decided by a bucket's age. A bucket's age is the time of the most recent event in the bucket. But. There can be buckets which even contain events from the future (those go to the so-called quarantine bucket). So it's a bit more complicated and depends on the data characteristics of the indexes involved.

dbinspect is the command to check for info on your buckets.
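For example, something like this (a rough sketch; endEpoch is the newest event time in the bucket, which is what the freezing order goes by):

| dbinspect index=*
| search state=cold
| sort 0 endEpoch
| table index bucketId state startEpoch endEpoch sizeOnDiskMB path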
