What are the MaxDataSize recommendations for a single indexer?

ctaf
Contributor

Hi,

We usually say that if we index more than 10 GB per day per index, we should set maxDataSize = auto_high_volume.

But does that apply to one indexer or the whole cluster?

In other words, if I receive 15 GB per day for index "main" but have 4 clustered indexers (3.75 GB per indexer), should I still set maxDataSize = auto_high_volume?

Thanks!

harsmarvania57
Ultra Champion

Hi @ctaf,

I recommend keeping maxDataSize = auto_high_volume even if you are ingesting only 3-4 GB per indexer per day. When you run a search, Splunk has to look for data across the buckets of the index, so fewer, larger buckets generally return results faster than many small ones.

The only downside of auto_high_volume is that a single bucket will hold 3-4 days of data, based on your ingestion per indexer per day. A bucket is not removed until all events in it have passed their retention period or the index hits maxTotalDataSizeMB, whichever comes first, so Splunk will use more storage.
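
As a rough illustration of that trade-off, the Python sketch below estimates how many days of ingest one bucket holds before it reaches its size cap. It assumes the bucket caps documented for 64-bit systems (750 MB for auto, 10 GB for auto_high_volume) and evenly spread ingestion. The function name and the disk_ratio parameter are hypothetical, not Splunk settings; disk_ratio = 1.0 ignores compression, so real buckets usually cover more days than this.

```python
# Bucket size caps for maxDataSize on 64-bit systems (assumed from the indexes.conf spec).
AUTO_MB = 750
AUTO_HIGH_VOLUME_MB = 10 * 1024  # 10 GB


def days_to_fill_bucket(per_indexer_gb_per_day, bucket_cap_mb, disk_ratio=1.0):
    """Estimate the days of ingest one bucket holds before it reaches its cap.

    disk_ratio is a hypothetical knob: on-disk size divided by raw ingest size.
    """
    if per_indexer_gb_per_day <= 0 or disk_ratio <= 0:
        raise ValueError("ingest volume and disk_ratio must be positive")
    daily_mb_on_disk = per_indexer_gb_per_day * 1024 * disk_ratio
    return bucket_cap_mb / daily_mb_on_disk


if __name__ == "__main__":
    for name, cap in (("auto", AUTO_MB), ("auto_high_volume", AUTO_HIGH_VOLUME_MB)):
        print(f"{name}: {days_to_fill_bucket(3.75, cap):.2f} days per bucket")
```

At 3.75 GB per indexer per day this gives under a day per bucket with auto and between two and three days with auto_high_volume, before compression is taken into account.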

I hope this answers your question.

Thanks,
Harshil

ctaf
Contributor

Hi @harsmarvania57,
Thanks, I understand the implications of this setting. But I am still wondering what the official recommendations are. We say that auto_high_volume is for 10 GB+/day. But does that mean per indexer or in total?

harsmarvania57
Ultra Champion

When you apply this setting in an indexer cluster, it applies per indexer, not in total. For official recommendations you might need to contact Splunk Support or Professional Services (PS). But I am using auto_high_volume for indexes that receive more than 3-4 GB/day per indexer.
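
To make the per-indexer reading concrete, here is a small Python sketch of the arithmetic from the question. It assumes the forwarders spread data evenly across the indexers. The function names and the threshold parameter are illustrative only, not Splunk settings, and the 10 GB default is the rule of thumb quoted in this thread rather than an official figure.

```python
def per_indexer_gb_per_day(total_gb_per_day, indexer_count):
    """Daily ingest one indexer sees for an index, assuming an even spread."""
    if indexer_count <= 0:
        raise ValueError("indexer_count must be positive")
    return total_gb_per_day / indexer_count


def exceeds_threshold(total_gb_per_day, indexer_count, threshold_gb_per_day=10.0):
    """True if the per-indexer volume is above the chosen rule-of-thumb threshold."""
    return per_indexer_gb_per_day(total_gb_per_day, indexer_count) > threshold_gb_per_day


if __name__ == "__main__":
    volume = per_indexer_gb_per_day(15, 4)
    print(f"per indexer: {volume} GB/day")
    print("above 10 GB/day rule of thumb:", exceeds_threshold(15, 4))
    print("above a 3 GB/day personal threshold:", exceeds_threshold(15, 4, 3.0))
```

For the 15 GB/day, 4-indexer case this yields 3.75 GB per indexer, which is below the 10 GB/day rule of thumb but above a 3 GB/day threshold.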
