I've recently been advised that our organization intends to retire the production domain where our current Splunk cluster resides and move everything over to the other domain in use. The deployment currently has nodes in two different domains, and the domain being retired happens to house both our Cluster Manager and four indexers in a two-site configuration running Splunk Enterprise 9.1.1.
I don't yet have all the details (i.e., whether the IP/hostname is changing or not). In an effort to do some pre-emptive housecleaning, I changed the 'serverName' on one of the indexers in advance, going from the FQDN to just the hostname. The CM then complained that the peer couldn't rejoin the cluster because the GUID already belonged to another indexer:
01-16-2024 13:43:03.307 +0000 ERROR ClusterMasterPeerHandler [25028 TcpChannelThread] - Cannot add peer=X.X.X.X mgmtport=8089 (reason: Peer with guid=<GUID> is already registered and UP).
This error feels a little bit like a chicken-and-egg situation. Essentially, I had just put the CM into maintenance mode, stopped the peer, updated serverName in server.conf, and started it back up. Perhaps I should have used 'splunk offline' instead of 'splunk stop' here?
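For reference, the change itself was a single line in server.conf on the indexer. A rough sketch of it is below. The file path assumes the setting lives under system/local, and 'idx01' is a placeholder rather than the real host name:

```
# $SPLUNK_HOME/etc/system/local/server.conf (assumed location)
[general]
# previously the FQDN; 'idx01' is a hypothetical short hostname
serverName = idx01
```

Nothing else was touched as part of this change. In particular, I did not edit the GUID.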
This has me thinking the operation we're about to undertake is a fairly complex one. I haven't been able to find any relatively recent posts about doing something similar, aside from a 2016 blog post that makes no mention of the GUID; I presume it was referring to stand-alone indexers rather than clustered ones. Changing the GUID is presumably a non-starter, since the existing buckets all reference it in their names...
Long story short, I'm looking for an order of operations and some dos and don'ts for an undertaking like this.