I've always been careful to set my index sizes to something along the lines of 1.1 * <peak indexed volume> * <retention period>, but recently I've been wondering whether there's any reason to worry about index sizes at all (e.g. just set them all to the full available space or some arbitrary very high value) if you have the following:
retention periods for freezing (and thus offloading from valuable high-IOPS disk space) such that the total across all indexes is significantly less than the space you have available
a minimum free disk space configured.
The way I see it, I can't think of many reasons to configure a maximum index size for any index, as long as at the very least the minimum free disk space is configured. One reason might be to stop lower-priority data from filling up space that could be used for more valuable data, but that seems like a different issue altogether.
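To make the two quantities in the question concrete, here is a minimal Python sketch of the sizing rule (1.1 * peak volume * retention) and of the "retention total is well below available space" condition. The function names, the example indexes, and the 80% margin standing in for "significantly less" are all assumptions made for illustration. If the product is Splunk (the post does not say so explicitly), the first result would correspond to `maxTotalDataSizeMB` in indexes.conf.

```python
def index_size_mb(peak_daily_mb, retention_days, headroom_percent=110):
    """Sizing rule from the question: 1.1 * peak volume * retention.

    Integer arithmetic with a ceiling, so the result is a whole number of MB.
    """
    raw = peak_daily_mb * retention_days
    return (raw * headroom_percent + 99) // 100


def caps_look_redundant(indexes, disk_mb, min_free_mb, margin_percent=80):
    """Return True if retention alone keeps all indexes well inside the disk.

    indexes maps a (hypothetical) index name to (peak_daily_mb, retention_days).
    margin_percent is an assumed stand-in for "significantly less".
    """
    expected = sum(peak * days for peak, days in indexes.values())
    usable = disk_mb - min_free_mb
    return expected * 100 <= usable * margin_percent


# Hypothetical example: 10 GB/day kept 30 days, 2 GB/day kept 90 days.
indexes = {"prod": (10240, 30), "nonprod": (2048, 90)}
print(index_size_mb(10240, 30))                        # 337920
print(caps_look_redundant(indexes, 1_000_000, 5_000))  # True
print(caps_look_redundant(indexes, 600_000, 5_000))    # False
```

Note that this check only covers normal ingestion rates; it says nothing about a burst that far exceeds the assumed peak.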
Is there perhaps a performance or best practice reason?
This is not about changing the number of buckets or the bucket sizes related to indexing, but simply about the maximum index size.
I've had issues where a developer accidentally wrote an infinite loop in a debugging log and produced 10 GB of log data in 3 hours. This then happened multiple times in one day, resulting in various issues.
Since the index size was limited, the damage was confined to the non-prod index belonging to this group of users.
If it had not been set, the volume limit (which was also in use) would have kicked in, but recent data would have been lost, as the oldest data gets deleted first! I would therefore recommend setting a maximum index size to protect against the data-flood scenario I am describing.
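The flood scenario can be illustrated with a toy Python model. It is only a sketch under stated assumptions: buckets are plain dictionaries, a shared volume limit evicts the oldest bucket regardless of which index owns it, and a per-index cap evicts only that index's own oldest bucket. The index names, sizes, and limits are made up for the example.

```python
def ingest(buckets, index, day, size_gb, index_caps, volume_cap_gb):
    """Add one bucket, then evict oldest buckets until every limit holds."""
    buckets.append({"index": index, "day": day, "size_gb": size_gb})

    # Per-index cap: only this index's own oldest buckets are evicted.
    cap = index_caps.get(index)
    if cap is not None:
        while sum(b["size_gb"] for b in buckets if b["index"] == index) > cap:
            own = [b for b in buckets if b["index"] == index]
            buckets.remove(min(own, key=lambda b: b["day"]))

    # Shared volume cap: the oldest bucket goes, whichever index owns it.
    while sum(b["size_gb"] for b in buckets) > volume_cap_gb:
        buckets.remove(min(buckets, key=lambda b: b["day"]))


def surviving_prod_gb(index_caps, volume_cap_gb=100):
    """Return how much prod data is left after a non-prod log flood."""
    buckets = []
    # Ten days of normal production data at 5 GB/day.
    for day in range(10):
        ingest(buckets, "prod", day, 5, index_caps, volume_cap_gb)
    # Day 10: a runaway debug log writes eight 10 GB buckets to non-prod.
    for _ in range(8):
        ingest(buckets, "nonprod", 10, 10, index_caps, volume_cap_gb)
    return sum(b["size_gb"] for b in buckets if b["index"] == "prod")


print(surviving_prod_gb({}))               # 20: volume limit evicted old prod data
print(surviving_prod_gb({"nonprod": 20}))  # 50: the flood only hurt non-prod
```

With only the volume limit, the flood pushes out 30 GB of older production data. With a cap on the non-prod index, the flooding index overwrites its own data and production is untouched.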