How to reduce the replication factor in a multisite indexer cluster to keep indexers' disk space from filling up?

Splunk Employee

We have a multisite indexer cluster running Splunk 6.2.7 with 40 indexers.
Many of the indexers have their data partitions over 90% full.
We are trying to free up disk space in order to avoid a catastrophic outage of the indexer cluster.
Ultimately, we are planning to add new indexers to the indexing pool, but we need an interim solution to buy us time.

Our primary site is site 2 while our DR site is site 1.
We want to reduce the replication factor by one.
The current replication settings on the cluster master are shown below:

[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,site2:2,total:4
site_search_factor = origin:1,site2:2,total:3

So, if we need to reduce the replication factor by 1 and we want more copies in our primary than our DR site, then we should change the line on the cluster master to:

site_replication_factor = origin:1,site2:2,total:3 <--Correct?
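
To compare what the current and proposed values ask for, here is a minimal Python sketch. It is not Splunk code, and the function names are hypothetical. It assumes the documented multisite rule that the origin site receives the greater of its explicit site value and the origin value. It only reports the guaranteed minimum per site and how many copies are left over; it does not model how the cluster master places the leftover copies.

```python
def parse_site_factor(value):
    """Parse a string such as 'origin:2,site2:2,total:4' into a dict of ints."""
    parts = {}
    for item in value.split(","):
        key, _, count = item.strip().partition(":")
        parts[key] = int(count)
    return parts


def minimum_copies(value, origin_site, sites):
    """Return (guaranteed minimum copies per site, copies left to place)."""
    factor = parse_site_factor(value)
    minimums = {site: factor.get(site, 0) for site in sites}
    # Assumed rule: the origin site gets the larger of its explicit
    # value and the origin value.
    minimums[origin_site] = max(minimums[origin_site], factor.get("origin", 0))
    remainder = factor["total"] - sum(minimums.values())
    if remainder < 0:
        # Sanity check of this sketch only, not Splunk's own validation.
        raise ValueError("total is smaller than the sum of the site minimums")
    return minimums, remainder


SITES = ["site1", "site2"]
CURRENT = "origin:2,site2:2,total:4"
PROPOSED = "origin:1,site2:2,total:3"

for label, value in (("current", CURRENT), ("proposed", PROPOSED)):
    for origin in SITES:
        minimums, remainder = minimum_copies(value, origin, SITES)
        print(label, "origin=" + origin, minimums, "left over:", remainder)
```

Under these assumptions, the proposed value keeps a minimum of 2 copies on site2 regardless of where the data originates. It guarantees 1 copy on site1 when site1 is the origin, and leaves 1 copy to be placed beyond the stated minimums when site2 is the origin.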

After the config change to server.conf on the cluster master, what is the procedure to apply it and remove the excess buckets?

Is it:
* server.conf change on cluster master
* restart splunk on cluster master to apply change
* Click the "Remove All Excess Buckets" button in the Bucket Status view of the cluster master UI
* Wait for excess buckets to be deleted

Are these the recommended steps?

Re: How to reduce the replication factor in a multisite indexer cluster to keep indexers' disk space from filling up?

Splunk Employee

The setting you are planning to apply will reduce disk utilization somewhat. To make this change, follow these steps:

1) On the cluster master, make the change to server.conf (this change requires a restart of the cluster master).
2) After that, use the remove excess-buckets command as described in http://docs.splunk.com/Documentation/Splunk/6.1.3/Indexer/Removeextrabucketcopies
3) My advice is to be selective: do a few indexes at a time, wait for the cluster to meet RF and SF, and then move on to the others.
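
One way to stay selective is to generate the per-index commands up front and run them one index at a time. The Python sketch below only builds and prints the `splunk list excess-buckets` and `splunk remove excess-buckets` command lines; it does not execute anything. The install path `/opt/splunk/bin/splunk`, the helper name, and the index name `web_logs` are assumptions for illustration only.

```python
import shlex

SPLUNK_BIN = "/opt/splunk/bin/splunk"  # assumed install path, adjust as needed


def excess_bucket_commands(indexes, splunk_bin=SPLUNK_BIN):
    """Build a list/remove command pair for each index, in order."""
    plan = []
    for index in indexes:
        # Review what would be removed for this index first.
        plan.append([splunk_bin, "list", "excess-buckets", index])
        # Then remove the excess copies for this index only.
        plan.append([splunk_bin, "remove", "excess-buckets", index])
    return plan


if __name__ == "__main__":
    # "web_logs" is a hypothetical index name.
    for argv in excess_bucket_commands(["main", "web_logs"]):
        print(shlex.join(argv))
```

Run the printed commands on the cluster master, and confirm that the cluster meets RF and SF again before moving on to the next index.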

Here are some bugs related to "remove excess-buckets" that you might be interested in:

SPL-108023 / SPL-98101: [Clustering] "remove excess-buckets" removes all buckets created under multisite clustering if the CM was moved from multisite to single site (reported on 6.2.4 and 6.2.6; fixed in 6.2.7 and 6.3).
SPL-90409: remove excess buckets does not remove all the excess buckets it should and causes the "fully searchable" criteria in the UI to fail. This is not a major bug: remove-excess-buckets may cause all-searchable to become not-all-searchable, but it should quickly return to all-searchable (fixed in 6.3).
SPL-106614: remove excess bucket doesn't remove summary files (this bug is still pending).
SPL-90986: splunk list excess-buckets only lists 30 indexes (reported on 6.1.3; resolved in 6.3).
SPL-98007: Clicking the "remove" button in the cluster management excess buckets page deletes all standalone buckets (reported on 6.2.1 and 6.2.2; fixed in 6.2.3).

In Splunk version 6.3, a new capability has been added that could help you manage such a situation better. Refer to: https://answers.splunk.com/answers/331293/if-two-cluster-peers-in-an-indexer-cluster-are-get.html
