Hi,
We are planning on upgrading from 6.4.1 to 6.5.3 and moving from 4 separate clusters to a single cluster with multisite clustering. My current setup has a replication factor and search factor of 2 on each cluster master.
My new setup, which I'm trying to test in dev, has 3 sites, and my server.conf has the following, but I'm getting "replication factor is not met" in the Cluster Master. Can someone help me understand what I'm missing? Took this class ages ago, and surely forgetting something...
available_sites = site1,site2,site3
mode = master
multisite = true
pass4SymmKey = *******
site_replication_factor = origin:1,total:3
site_search_factor = origin:1,total:3
cluster_label = DEV_SPLUNK
Honestly, I would recommend breaking this into two separate activities just to keep things simple:
You have 4 clusters across 3 sites.
Start by switching them all to be on the same single-site cluster. Even though they are technically in different data centers, Splunk doesn't know that unless you tell it with multisite. So now you have all indexers reporting to the same Master Node, all having the same indexes.conf and other settings (no pesky system/local stuff messing with you), and all forwarders sending data to all of those indexers (or at least the ones geographically close to them). For safety, I'd rock and roll with a rep factor of 4 (1 more than the number of data centers, to mitigate the migration risk should you lose a site) and a search factor of just 1 or 2.
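A minimal sketch of what the master's server.conf could look like for that interim single-site step, assuming the rep factor of 4 and a search factor of 2 as suggested above. The pass4SymmKey value is a placeholder, and the label is simply reused from the question:

```ini
# server.conf on the master node (interim single-site cluster)
[clustering]
mode = master
# 1 more than the number of data centers, per the suggestion above
replication_factor = 4
search_factor = 2
# placeholder, use your own key
pass4SymmKey = <your key>
cluster_label = DEV_SPLUNK
```

Keep in mind that the cluster needs at least as many peers as the replication factor before it can report the factor as met.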
Once that is working with basic clustering and all systems are green, then explore the transition to multisite. You can even start with a baby step and just have a single site with multisite config. At that point, you'll want to neuter the old single site settings (as highlighted by @rtacy) and implement your multisite config. Remember that with multisite, you can save $$$ on storage by needing fewer copies (2 or 3 instead of 4) AND you can do active cluster upgrades AND data center migrations. Like with regular clustering, only data indexed after multisite is enabled will be multisite replicable (no retroactive).
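One piece of the multisite step that is easy to overlook: every node needs to be told which site it belongs to, using the site attribute under [general] in server.conf. A rough sketch, assuming the master sits in site1 and the indexer shown sits in site2 (both assignments are only for illustration):

```ini
# server.conf on the master node (assumed to be in site1)
[general]
site = site1

# server.conf on an indexer (assumed to be in site2)
[general]
site = site2
```

This shows only the site assignment, not a complete node configuration; the [clustering] stanza on each node stays in place alongside it.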
If you haven't already, this is a great time to make sure you have a DNS alias for the master node so if you lose it you can stand a new one up rapidly without having to repoint all the indexers.
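To illustrate the DNS alias idea, the peers would reference the master through the alias instead of its real hostname. The name splunk-cm.example.com below is made up, and 8089 is assumed to be the management port in use:

```ini
# server.conf on each peer: point at the master via a DNS alias
[clustering]
# hypothetical alias; repoint the DNS record if the master is rebuilt
master_uri = https://splunk-cm.example.com:8089
```

If the master ever has to be replaced, only the DNS record changes and the indexers keep their existing config.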
The single site replication factor (default 3) and single site search factor (default 2) are still active when you configure a multi site cluster. I suspect that unless you have at least 3 peers in each site you'll get the errors that you're seeing. Consider setting both replication_factor and search_factor to 1 if you don't have enough peers. If you're merging your separate clusters into one cluster, consider that replication_factor and search_factor will apply to the single site buckets that you bring over.
Thanks. Isn't that what I have already set?
site_replication_factor = origin:1,total:3
site_search_factor = origin:1,total:3
Yes, you set a custom site_replication_factor and site_search_factor, but the default values for replication_factor (3) and search_factor (2) may be preventing your cluster from being complete. If you're experimenting and you only have one peer in each site, consider using the following config to avoid the message:
site_replication_factor = origin:1,total:3
site_search_factor = origin:1,total:3
replication_factor = 1
search_factor = 1
Thanks. I actually have 2 indexers in each site.
2 indexers in each site aren't enough to use the defaults. You'll need at least the following config:
site_replication_factor = origin:1,total:3
site_search_factor = origin:1,total:3
replication_factor = 2
Bingo. @jtacy is spot on. In my lab, I even have those single site values set to 1 just to get them out of the way:
replication_factor = 1
search_factor = 1
Also, don't forget to review the topic on this activity: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Migratetomultisite
Do you have repFactor = auto set in indexes.conf?
https://docs.splunk.com/Documentation/Splunk/6.6.0/Admin/Indexesconf
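For reference, repFactor is set per index in indexes.conf, and an index is only replicated across the cluster when it is set to auto (the default is 0, meaning no replication). A sketch with a hypothetical index name and paths:

```ini
# indexes.conf, distributed to the peers
# "my_index" and its paths are placeholders for illustration
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# replicate this index according to the cluster's replication settings
repFactor  = auto
```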