We are trying to move from a single-site to a multisite Splunk cluster. However, it's not clear how SH clustering is supposed to work.
1. As per the documentation, the recommended way is to have two separate SH clusters. But it doesn't look like there will be any knowledge bundle (configs, user knowledge objects, etc.) replication between the two SH clusters. If that is the case, then I don't get the point of suggesting multisite as a DR solution. When site1 fails, users connecting to site2 won't have their knowledge objects and settings on the new SH cluster!?
2. The other suggested option, which does provide knowledge bundle and search artifact replication, is a single SH cluster spanning both sites. BUT this can't be suggested as a DR solution either: whenever the site with the majority (or an equal number) of SHs fails completely, the SH machines at the other site won't be able to form a cluster, since they won't have a majority. A workaround is suggested here: deploy a static captain instead.
Thank you for the response. However, if we have a single SHC, would search affinity work? As far as I understand, search affinity works with multisite enabled. So if multisite is not enabled and we have a single SHC, the SH machines may run searches across all indexer peers spanning multiple DCs, and this could be a performance concern, if I am right?
Multisite is an indexer cluster feature, not an SHC feature. Of course, if your SHC is connected to a multisite indexer cluster, then you must add the site information to your SHC members as well. But if your indexer cluster spans multiple DCs, you should consider using a multisite indexer cluster instead of a single-site cluster with peers in several DCs. Without multisite you cannot protect your data against the loss of a DC.
As @richgalloway said, an SHC is not a DR solution; it only provides high availability. For better HA, make sure you have an odd number of nodes in your SHC. The reason is the Raft protocol, which the SHC uses to elect a captain when one or more nodes, or a whole site, are lost. You should also ensure that one site holds a majority (at least 51%) of all nodes, counting all configured nodes and not just the active ones. In that case the cluster can continue automatically if the other site is lost. And, as you already mentioned, if you lose the site that holds the majority, you can bring the cluster back up on the remaining site by setting up a static captain.
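To make the majority rule concrete, here is a minimal Python sketch of the arithmetic only. It does not talk to Splunk; the function names and the example site sizes are hypothetical, and it simply assumes that a captain can be elected when the surviving members are a strict majority of all configured members.

```python
from typing import Mapping


def has_majority(surviving_members: int, total_members: int) -> bool:
    """True if the survivors are a strict majority of ALL configured members."""
    return 2 * surviving_members > total_members


def can_elect_after_site_loss(site_sizes: Mapping[str, int], failed_site: str) -> bool:
    """Check whether the remaining sites could still elect a captain.

    site_sizes maps a site name to its number of SHC members (hypothetical data).
    """
    total = sum(site_sizes.values())
    surviving = total - site_sizes[failed_site]
    return has_majority(surviving, total)


if __name__ == "__main__":
    # Odd member count with the majority on site1 (illustrative numbers).
    uneven = {"site1": 3, "site2": 2}
    print(can_elect_after_site_loss(uneven, "site2"))  # site1 keeps 3 of 5
    print(can_elect_after_site_loss(uneven, "site1"))  # site2 keeps 2 of 5

    # Even split: neither site holds a majority on its own.
    even = {"site1": 2, "site2": 2}
    print(can_elect_after_site_loss(even, "site1"))
```

With a 3/2 split, losing the smaller site leaves the cluster able to elect a captain, while losing the larger site does not, which is the case where the static captain workaround comes in. With an even split, losing either site leaves the survivors without a majority.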