Hello,
I need to remove a site from a three-site cluster. I am following the instructions in https://docs.splunk.com/Documentation/Splunk/8.2.0/Indexer/Decommissionasite, and I have summarized the steps as follows:
- Check that the cluster is in a complete state
- Move the manager away from the decommissioned site
- Remove the peers in the decommissioned site as receivers for the UFs
- Enter maintenance mode
- Modify server.conf on the manager node:
from: available_sites = site1, site2, site3 to: available_sites = site1, site2
from: site_replication_factor = origin:1,site1:1,site2:1,site3:1,total:3 to: site_replication_factor = origin:2,total:3
from: site_search_factor = origin:1, total:2 to: site_search_factor = origin:1,total:2
add: site_mappings = site3:site1
- Restart the manager
- Disable maintenance mode
- Stop Splunk on each peer in the decommissioned site
- Wait for the cluster to return to a complete state
- Remove the peers
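For reference, the server.conf step above can be sketched as a small script. This is only an illustration: it assumes the four settings live in the [clustering] stanza on the manager node, the "before" text is a hypothetical minimal stanza rather than a real file, and the helper name apply_decommission_changes is made up for this example. Python's configparser is used as a rough stand-in for Splunk's .conf format, which it does not fully model.

```python
import configparser
import io

# Hypothetical "before" state of the [clustering] stanza on the manager node.
BEFORE = """\
[clustering]
available_sites = site1, site2, site3
site_replication_factor = origin:1,site1:1,site2:1,site3:1,total:3
site_search_factor = origin:1,total:2
"""

# Target values copied from the plan in this post.
CHANGES = {
    "available_sites": "site1, site2",
    "site_replication_factor": "origin:2,total:3",
    "site_search_factor": "origin:1,total:2",
    "site_mappings": "site3:site1",
}


def apply_decommission_changes(conf_text, changes):
    """Return conf_text with the given [clustering] settings set or added."""
    # Only "=" is a delimiter, so values such as "origin:2" are left intact.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    for key, value in changes.items():
        parser.set("clustering", key, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


AFTER = apply_decommission_changes(BEFORE, CHANGES)
print(AFTER)
```

Review the result by hand before putting it on the manager; the script only rewrites text and does not check that the values are valid for the cluster.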
How can I verify that everything went as expected?
Should I check the buckets, run a query, ...?
Thanks
Your plan seems sound, although I would change the site RF to 2:2. Using 2:3 means Splunk will store two copies of your data on one site (not always the same site) so you'll need additional storage. RF of 2:2 keeps your current storage usage.
Thanks for the answer.
After setting the RF to 2:2, I get this error in splunkd.log:
03-28-2023 12:47:53.255 +0700 ERROR ClusteringMgr - Failure to load cluster config (server.conf) Error = site_replication_factor={ origin:1, total:2 } is less than replication_factor=3.
03-28-2023 12:47:53.256 +0700 ERROR loader - clustering initialization failed; won't start splunkd
Could you kindly suggest how to verify that an event originating from one of the peers in the decommissioned site has been migrated? An SPL search with a simple query such as index=blabla "*<old-peer-name>*" returns no results.
Regards
To reduce the site RF you must also reduce replication_factor.
Data on the decommissioned site will not be migrated, because it is already copied on each of the other two sites. That is a result of your site_replication_factor setting.
When the site is decommissioned, the Cluster Manager will ensure a primary bucket exists somewhere in the cluster. Once the CM says the search and replication factors are met then you are done.
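The constraint behind the startup error quoted earlier in the thread can be checked before restarting the manager. The sketch below is a simplified illustration inferred only from that error message (the total in site_replication_factor must not be less than replication_factor); it is not Splunk's own validation, which checks more than this, and the function names are hypothetical.

```python
def parse_site_factor(value):
    """Parse a string like 'origin:1, total:2' into {'origin': 1, 'total': 2}."""
    result = {}
    for part in value.split(","):
        key, _, count = part.strip().partition(":")
        result[key.strip()] = int(count)
    return result


def check_replication_settings(site_replication_factor, replication_factor):
    """Return a description of the problem, or None if this one check passes.

    Assumption: the only rule modeled here is the one seen in the error
    message, i.e. the site RF total must be >= replication_factor.
    """
    total = parse_site_factor(site_replication_factor)["total"]
    if total < replication_factor:
        return (
            f"site_replication_factor total={total} is less than "
            f"replication_factor={replication_factor}"
        )
    return None


# The combination from the error log: total:2 with replication_factor=3.
print(check_replication_settings("origin:1, total:2", 3))
# The same site RF once replication_factor is also reduced to 2.
print(check_replication_settings("origin:1, total:2", 2))
```

The first call reports the mismatch and the second returns None, which matches the advice above: lower replication_factor together with the site RF total.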