(Splunk version is 6.2.5)
We had an issue with an index growing too large and not being cleaned up according to its "maxTotalDataSizeMB" setting.
Later we found that the index contained a lot of excess buckets, replicated after an unexpected crash earlier.
The question is: does "maxTotalDataSizeMB" apply to all buckets, including excess ones?
If not, I believe it should be documented more clearly, since I couldn't find anything about it in the docs.
The size of all buckets in HOT/WARM/COLD is used to enforce maxTotalDataSizeMB, both in a non-clustered as well as a clustered indexer environment. In the latter, this would include replicated buckets.
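For reference, this is the sort of indexes.conf stanza involved (the index name, paths, and 100 GB cap below are illustrative, not from the original thread):

```
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# Cap the combined size of hot + warm + cold buckets at ~100 GB;
# once exceeded, the oldest buckets roll to frozen.
maxTotalDataSizeMB = 100000
```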
Please add a comment on the documentation page where you would expect this to be explicitly stated.
Thanks, but I didn't mean replicated buckets (rb_*). I meant excess buckets, which are created when one peer goes down and are no longer needed once it comes back up.
I would expect to see a reference to this behavior in one of these docs:
This is the answer from Steve Goodman after I commented on the doc:
The maxTotalDataSizeMB parameter determines the maximum size of the index. When the index reaches that size, the oldest buckets roll to frozen. In an indexer cluster, the index size is independent of the number of copies of any particular bucket. Rather, it's the combined size of one searchable copy of each bucket.
The excess bucket copies that are discussed in this topic are excess copies of legitimate buckets; that is, buckets that have not yet rolled to frozen (for example, because maxTotalDataSizeMB has been reached).
These excess buckets exist due to cluster fix-up activities that occur when a peer goes down for some amount of time. The fix-up activities cause the cluster to create new copies of buckets that were residing on the downed peer, in order to meet the cluster replication and search factors. When the downed peer later rejoins the cluster, it still has those bucket copies, which results in excess copies of some buckets.
For example, assume the cluster has a replication factor of 3, meaning that each bucket should have three copies. If a peer with a copy of bucket1 goes down, the cluster will add a copy of bucket1 to some other peer, so that the cluster again has three copies. If the peer then rejoins the cluster, the cluster will now be holding four copies of bucket1. Since the cluster only needs three copies to fulfill its replication factor, it has one excess copy.
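The bookkeeping in that example can be sketched as follows (a toy illustration mirroring the scenario above, not actual Splunk internals):

```python
# Excess-copy scenario: replication factor 3, one peer goes down and later rejoins.
replication_factor = 3            # cluster wants 3 copies of each bucket
copies_of_bucket1 = 3             # steady state

copies_of_bucket1 -= 1            # peer holding one copy goes down
copies_of_bucket1 += 1            # fix-up: cluster replicates to another peer
copies_of_bucket1 += 1            # downed peer rejoins, still holding its old copy

excess_copies = copies_of_bucket1 - replication_factor
print(excess_copies)              # one copy beyond what the cluster needs
```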
In the case where a peer has been down for some period of time, such that certain buckets have been rolled to frozen during the interim and the peer is still holding copies of those buckets, then those buckets will automatically roll to frozen when the peer returns to the cluster. This is really a separate issue from the excess copy issue, though.
I hope this helps to clarify the relationship between maxTotalDataSizeMB and bucket copies.
Actually, the previous information was incorrect in certain respects. The maxTotalDataSizeMB attribute operates on each peer node individually. It does not operate across the cluster. So, each peer node tallies the size of the index for itself, based on the bucket copies it is holding for that index.
In other words, the index size for any peer node is determined by the total size of all bucket copies for that index on each peer. It doesn't matter whether the copies are primary copies, searchable copies, non-searchable copies, or excess copies. They all count toward the index size on that peer.
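That per-peer tally can be sketched like this (the function and the copy list are hypothetical, just illustrating the accounting described above):

```python
def peer_index_size_mb(local_bucket_copies):
    """Total index size on one peer: every local bucket copy counts,
    whether primary, searchable, non-searchable, or excess."""
    return sum(size_mb for _copy_type, size_mb in local_bucket_copies)

# Three 500 MB copies of various types held by this peer:
copies = [("primary", 500), ("non-searchable", 500), ("excess", 500)]
size = peer_index_size_mb(copies)

# With maxTotalDataSizeMB = 1000, this peer would start freezing its
# oldest buckets, even though two of the copies are "just" extras.
over_limit = size > 1000
```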
For more information, including how the cluster handles buckets freezing at different times on different peer nodes, see http://docs.splunk.com/Documentation/Splunk/6.3.3/Indexer/Bucketsandclusters#How_the_cluster_handles...
I also have a question about this topic 🙂 The Splunk documentation is a bit misleading here. Below is a snippet from the Splunk archiving policy page. If I read it correctly, maxTotalDataSizeMB applies from cold to frozen. However, people are saying it applies across all buckets, so I am a bit confused about the HOT/WARM buckets. My goal is to always retain the last 90 days of data, no matter when a rollover occurs. I have set frozenTimePeriodInSecs to 90 days and maxTotalDataSizeMB = 90GB.
"Set attributes for cold to frozen rolling behavior
The maxTotalDataSizeMB and frozenTimePeriodInSecs attributes in indexes.conf help determine when buckets roll from cold to frozen. These attributes are described in detail below."
When the index reaches maxTotalDataSizeMB, the indexer freezes the oldest bucket in the index.
If you want to ensure that data freezes only after 90 days, then set frozenTimePeriodInSecs to 7776000 and set maxTotalDataSizeMB to a value larger than the maximum amount of data you will index in any 90 day period, so that it does not kick in before frozenTimePeriodInSecs. And be sure that you have sufficient storage available to hold that maximum amount of data.
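For example (an illustrative stanza; the index name and the ~500 GB headroom are assumptions, so size the cap to your own data rate):

```
[critical_index]
# Freeze strictly by age: 90 days = 90 * 24 * 3600 = 7776000 seconds.
frozenTimePeriodInSecs = 7776000
# Size cap set well above any realistic 90-day volume so it never
# triggers before the time-based policy (~500 GB here).
maxTotalDataSizeMB = 512000
```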
Great feedback, thanks for the clarity! I have a lot of indexes, but there is one critical index where I need to ensure the last 90 days are always available. I have been losing all data in this particular index on the rollover date. This will be helpful, thank you!