
In a clustered environment, how do I figure out how much space an index is truly using?


I have a cluster of 6 nodes, replication factor of 3, search factor of 2.

As a pre-question:

In indexes.conf, say I set maxTotalDataSizeMB to 1024 (1GB). Does this mean that the index can grow to a max of 1GB on each indexer (meaning I'd be storing up to 3GB of data)? Or does it mean that the total across all the indexers it replicates to is 1GB?

The main question though:

On a non-clustered setup, I can go to Indexes in the Manager and see the size of each index, and that is more or less the actual size of the index.

On a clustered setup, I can see the index size from the Clustering view on the Master. Where does this number come from? Is it the sum of the index size on all of the peers?

Is there a way I can see the size of an index on a peer-by-peer basis?


Re: In a clustered environment, how do I figure out how much space an index is truly using?

Splunk Employee

Ricapar, maxTotalDataSizeMB applies per indexer: each indexer in the cluster uses up to 1GB for that index. With 3 peers, that would be a max of 3GB in total across the cluster.
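
To make the arithmetic concrete, here is a minimal Python sketch of that upper bound. The function name and its parameters are made up for illustration; the only fact it encodes is that the limit is applied on each indexer separately, so the worst case is the limit multiplied by the number of peers holding the index.

```python
def max_cluster_footprint_mb(max_total_data_size_mb: int, peers_with_index: int) -> int:
    """Upper bound on disk used by one index across the cluster.

    Hypothetical helper: the per-indexer limit (maxTotalDataSizeMB)
    times the number of peers that hold data for the index.
    """
    return max_total_data_size_mb * peers_with_index


# The case from this thread: a 1024 MB limit and 3 peers.
print(max_cluster_footprint_mb(1024, 3))  # 3072
```

This is only a ceiling. Actual usage depends on how much data each peer really holds.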

I believe the index size is the size of the compressed rawdata of all the buckets in the index where each bucket is counted only once (that is, copies are not counted). It doesn't include the size of hot buckets. So this is almost the actual size of all the compressed raw data in the index.

The UI gets it from master splunkd. The master knows the final size of a bucket when it rolls from hot to warm. (That's also why it doesn't know the size of hot buckets).
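
As a rough illustration of that counting rule (each bucket counted once, hot buckets skipped), here is a small Python sketch. The record layout is entirely an assumption for the example: the keys "index", "bucket_id", "peer" and "size_bytes" are hypothetical and are not the master's actual output format, and a size of None stands in for a hot bucket whose final size is not yet known.

```python
from typing import Iterable, Mapping


def index_size_bytes(copies: Iterable[Mapping], index: str) -> int:
    """Sum bucket sizes for one index, counting each bucket only once.

    `copies` holds one record per bucket copy on a peer (hypothetical layout).
    Copies with no known size (hot buckets) are skipped.
    """
    size_by_bucket = {}
    for copy in copies:
        if copy["index"] != index:
            continue
        size = copy.get("size_bytes")
        if size is None:  # hot bucket: final size unknown, so not counted
            continue
        # Keep the first size seen for a bucket; replicated copies are ignored.
        size_by_bucket.setdefault(copy["bucket_id"], size)
    return sum(size_by_bucket.values())


# Made-up sample data: two warm buckets replicated to several peers,
# one hot bucket, and one bucket that belongs to a different index.
example_copies = [
    {"index": "main", "bucket_id": "b1", "peer": "peer1", "size_bytes": 100},
    {"index": "main", "bucket_id": "b1", "peer": "peer2", "size_bytes": 100},
    {"index": "main", "bucket_id": "b1", "peer": "peer3", "size_bytes": 100},
    {"index": "main", "bucket_id": "b2", "peer": "peer1", "size_bytes": 200},
    {"index": "main", "bucket_id": "b2", "peer": "peer4", "size_bytes": 200},
    {"index": "main", "bucket_id": "b3", "peer": "peer2", "size_bytes": None},
    {"index": "other", "bucket_id": "b9", "peer": "peer1", "size_bytes": 50},
]

print(index_size_bytes(example_copies, "main"))  # 300
```

Note that the result is 300, not the 700 you would get by adding up every copy on every peer. That is the difference between the number shown on the master and the disk space actually consumed across the cluster.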

On the master's management endpoint, if you go to /services/cluster/master/bucket/ you should see a size listed if the bucket is a warm bucket. See:
