Movement of buckets in an indexer cluster

Communicator

Hello all, today someone asked me a question about bucket movement in an indexer cluster. Let's say I have 5 indexers in a cluster and an index called operations. Assuming that replication is working properly, my questions are as follows:
When my hot bucket rolls to warm, what happens to the rest of the buckets that are replicated on the indexers?
When my warm bucket rolls to cold, what happens to the rest of the buckets that are replicated on the indexers?
When my cold bucket rolls to frozen, what happens to the rest of the buckets that are replicated on the indexers?
I have read a lot of articles about bucket movement, but I cannot find a proper explanation for this. Any help in shedding some light on this topic would be highly appreciated.
Thanks

1 Solution

Esteemed Legend

Full bucket replication only happens for warm buckets (hot buckets are replicated in slices, in a way that is not clearly documented). Bucket rolling is enforced independently by each indexer, but the Cluster Master is watching. So hot bucket X for index Y on indexer Z is not replicated as a full bucket because it is still being written to. When it rolls to warm, it is replicated according to your replication and time/size settings. When it rolls to cold, nothing changes. When it freezes, however, RF/SF will only be enforced if there is space somewhere for it, which is unlikely. As for buckets in other indexes: index Y has no effect on index P, except as you would expect under all-index, volume-based settings, where the oldest bucket Q (across all buckets in all indexes) is deleted to make room for new buckets (including replicated buckets). Again, each indexer enforces deletion based on its own settings and situation, independently of the other indexers and without any input from the Cluster Master.
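
To make the "each indexer enforces deletion independently" point concrete, here is a toy Python model. It is only a sketch and not Splunk code: the `Bucket` and `Peer` classes, the `max_total_mb` cap, and all sizes are made up for illustration, and real freezing also depends on age-based settings that are ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    bucket_id: str
    latest_event_time: int  # epoch seconds of the newest event in the bucket
    size_mb: int


@dataclass
class Peer:
    name: str
    max_total_mb: int  # hypothetical stand-in for a per-index size cap
    buckets: list = field(default_factory=list)

    def add(self, bucket):
        """Store a bucket copy, then freeze oldest copies until under the cap.

        Returns the ids of the buckets this peer froze. No other peer and no
        cluster master is consulted: the decision is purely local.
        """
        self.buckets.append(bucket)
        frozen = []
        while sum(b.size_mb for b in self.buckets) > self.max_total_mb:
            oldest = min(self.buckets, key=lambda b: b.latest_event_time)
            self.buckets.remove(oldest)
            frozen.append(oldest.bucket_id)
        return frozen


# Two peers with the same cap but different existing contents.
idx1 = Peer("idx1", max_total_mb=300)
idx2 = Peer("idx2", max_total_mb=300)
idx1.add(Bucket("b1", latest_event_time=100, size_mb=200))
idx2.add(Bucket("b2", latest_event_time=150, size_mb=50))

# The same new bucket lands on both peers (original plus replicated copy).
b3 = Bucket("b3", latest_event_time=200, size_mb=150)
frozen_on_idx1 = idx1.add(b3)  # idx1 goes over its cap and freezes b1
frozen_on_idx2 = idx2.add(b3)  # idx2 still has room and freezes nothing
```

In this model the same incoming bucket forces a freeze on one peer and not on the other, which is the behavior described above: the peers reach their limits at different times.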

Communicator

Correct me if I'm wrong here. So basically the following happens:
1. When data is received for the first time, it creates a hot bucket, and no replication happens.
2. When the bucket moves from hot to warm, the cluster master is notified, replication happens, and all the other indexers get a copy of the bucket.
3. When the bucket moves to cold, no changes happen across the cluster.
4. For movement of data from cold to frozen, does the following happen? I got this from the Splunk docs:

 When a primary copy freezes, the cluster reassigns the primary to another searchable copy, if one exists. Searching then continues on that bucket with the new primary copy. When that primary also freezes, the cluster attempts to reassign the primary yet again to another searchable copy. Once all searchable copies of the bucket have been frozen, searching ceases on that bucket. When a peer freezes a copy of a bucket, it notifies the master. The master then stops doing fix-ups on that bucket. It operates under the assumption that the other peers will eventually freeze their copies of that bucket as well. If the freezing behavior is determined by the maxTotalDataSizeMB attribute, which limits the maximum size of an index, it can take some time for all copies of the bucket to freeze, as an index will typically be a different size on each peer. Therefore, the index can reach its maximum size on one peer, causing the oldest bucket to freeze, even though the index is still under the limit on the other peers.
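
The primary reassignment described in that docs excerpt can be sketched as a small Python toy. This is only my reading of the excerpt, not how Splunk implements it: the `freeze_copy` function, the dict layout, and the alphabetical choice of the next primary are all assumptions made for illustration.

```python
def freeze_copy(copies, primary, peer):
    """Freeze the bucket copy held by `peer` and return the new primary.

    copies:  dict mapping peer name -> True if that copy is searchable
    primary: name of the peer currently holding the primary copy (or None)
    Returns the peer that holds the primary afterwards, or None when no
    searchable copy is left and searching on the bucket stops.
    """
    copies.pop(peer, None)  # the frozen copy is no longer part of the index
    if peer != primary:
        return primary  # a non-primary copy froze, so the primary stays put
    # The primary froze: hand the role to a remaining searchable copy.
    for name in sorted(copies):  # alphabetical order is an arbitrary choice
        if copies[name]:
            return name
    return None


# One bucket with two searchable copies and one non-searchable copy.
copies = {"idx1": True, "idx2": True, "idx3": False}
primary = "idx1"

primary = freeze_copy(copies, primary, "idx1")
after_first_freeze = primary   # idx2 takes over as primary

primary = freeze_copy(copies, primary, "idx2")
after_second_freeze = primary  # only a non-searchable copy remains
```

After the second freeze only the non-searchable copy on `idx3` is left, so the function returns `None`, which corresponds to "searching ceases on that bucket" in the excerpt.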

Also, I have another question: when replication is happening in Splunk and indexer Y gets a bucket because indexer X rolled it from hot to warm, does indexer Y treat it as a warm bucket, or is there no concept of hot/warm/cold during replication?

Esteemed Legend

Replication happens for warm and cold, but most of the cold activity is buckets getting reassigned as primary when the primary on another indexer freezes, whereas for warm, new buckets are constantly being created and need to be replicated. You only see cold replication when you lose an indexer, and then both cold and warm go CRAZY. This is why you need to be sure that you enable maintenance mode when taking an indexer down: the Cluster Master is VERY aggressive about maintaining RF/SF, and even a small scheduled outage can freeze (delete) thousands of buckets FOREVER to make room for extra/unneeded replication buckets, which are younger.
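
As a rough illustration of the maintenance-mode advice, the Python sketch below just lists the order of operations for a short planned outage of one peer. The `planned_outage_steps` helper and the host label are hypothetical, and it only builds the list, it does not run anything. The CLI commands are the standard `splunk` ones as far as I know, but check them against the docs for your version before relying on this.

```python
def planned_outage_steps(peer_host):
    """Return (where, command) pairs for briefly taking one peer down.

    Maintenance mode is enabled on the cluster master first so that it does
    not start fix-up replication while the peer is away, and it is disabled
    again only after the peer is back.
    """
    master = "cluster master"
    return [
        (master, "splunk enable maintenance-mode"),
        (master, "splunk show maintenance-mode"),   # confirm it is on
        (peer_host, "splunk offline"),              # take the peer down
        (peer_host, "splunk start"),                # bring it back afterwards
        (master, "splunk disable maintenance-mode"),
    ]


steps = planned_outage_steps("idx3")
for where, command in steps:
    print(f"[{where}] {command}")
```

The point is only the ordering: maintenance mode goes on before the peer stops and comes off after it has rejoined the cluster.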
