We usually say that if we index more than 10GB per day per index, we should put maxDataSize = auto_high_volume
But does that apply to one indexer or the whole cluster?
In other words, if I received 15GB per day for index "main", but I have 4 clustered indexers (3.75GB per indexer), should I still put maxDataSize = auto_high_volume?
I'll recommend to keep maxDataSize = auto_high_volume even if you are ingesting only 3-4 GB per indexer per day. When you execute search query splunk will try to find data in different buckets so if you have less number of buckets splunk will return results quickly compare to many buckets with smaller size.
Only cons with
auto_high_volume is single bucket will store 3-4 days data based on your ingestion per indexer per day and until and unless all events in single bucket will reach their retention period or you hit with maxTotalDataSizeMB for that particular index whicever is earlier those bucket will not remove due to this splunk will use more storage.
I hope this clears your query.
Thanks, I understand the implications of this setting. But I am still wondering what are the official recommandations? We say that auto_high_volume is for 10GB+/day. But does that mean per indexer or in total?
When you apply this setting in Indexer Cluster, it will apply to per indexer not in total. For official recommendations you might need to contact splunk support or ps. But I am using auto_high_volume for indexes which is sending more than 3-4GB/day per indexer.