This is a complex task, so please respond only if you have high confidence in the nature of the error I'm receiving; I don't want to go on a wild goose chase. I'm on Splunk version 6.6.2.
I'm setting up a new multisite indexer cluster (I've done this before during a professional services engagement), and I'm following the Splunk docs on setting up clusters and multisite clusters very closely. I've read all the docs related to these topics several times over, and I feel I have a very good understanding of the tasks to be completed.
However, I'm running into an error that prevents the cluster peers from starting. I will post the error at the end, due to its length.
I've configured the master node: http://docs.splunk.com/Documentation/Splunk/6.6.2/Indexer/Configuremasterwithserverconf
and
http://docs.splunk.com/Documentation/Splunk/6.6.2/Indexer/Multisiteconffile
The master node is online and waiting for the cluster peers (indexers) to come online, just as the documentation said it would. I've also configured the peer nodes: http://docs.splunk.com/Documentation/Splunk/6.6.2/Indexer/Configurepeerswithserverconf
and
http://docs.splunk.com/Documentation/Splunk/6.6.2/Indexer/Multisiteconffile
Now, when I attempt to start the peer nodes, I get the errors shown at the end of this post in splunkd.log, and splunkd won't start.
I've tried many ways of defining repFactor=auto or repFactor=0, but the error makes no sense to me. Following the directions in the error has not made any difference. The same error occurs whether master-apps on the cluster master is empty or contains appropriate indexes.conf files.
Thanks for any help.
Error when attempting to start cluster peer node (indexer):
02-21-2018 11:15:36.652 -0500 ERROR CMBundleMgr - Download bundle failed, err="App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication."
02-21-2018 11:15:37.652 -0500 ERROR CMSlave - event=getActiveBundle failed with err="App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication." even after multiple attempts, Exiting..
02-21-2018 11:15:37.653 -0500 ERROR loader - Failed to download bundle from master, err="App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.,App='system' with replicated index='_introspection' is neither in the bundle downloaded from master nor managed by local deployment client. Either define this index at the master or specify repFactor=0 on peer to skip replication.", Won't start splunkd.
Long story short, I had to generate a new bundle on the cluster master, then everything worked as intended.
Next crisis...
I've attempted to provide an exhaustive answer to this at What is the best practice to address the "..is neither in the bundle downloaded from master nor mana...
vi /opt/splunk/etc/apps/org_cluster_indexer_base/local/indexes.conf
[default]
repFactor = auto
I had to comment out the above stanza and attribute on each of the two indexers, one at a time, after which they stopped throwing the error.
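For anyone who wants to check whether an indexes.conf like the one above still carries a global repFactor before restarting a peer, here is a minimal Python sketch. It is not a Splunk tool: the function name `global_repfactor` is made up for illustration, and it assumes that a setting counts as global when it sits in the [default] stanza or above the first stanza header.

```python
def global_repfactor(conf_text):
    """Return the globally set repFactor value, or None if there is none.

    Assumption: "global" means the setting appears in the [default] stanza
    or before any stanza header. Commented-out lines are ignored.
    """
    stanza = "default"  # lines before the first header are treated as global
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            stanza = line[1:-1].strip()
            continue
        if stanza == "default" and "=" in line:
            key, _, val = line.partition("=")
            if key.strip() == "repFactor":
                value = val.strip()
    return value


# The stanza from the answer above, before and after commenting it out.
before = "[default]\nrepFactor = auto\n"
after = "#[default]\n#repFactor = auto\n"
```

Calling `global_repfactor(before)` reports the global value, while `global_repfactor(after)` finds nothing. To run it against a real file, read the file's text and pass it in; the path used in the answer above is only one place such a setting could live.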
I filed an internal bug, SPL-151123, and I think the steps to reach this state were:
1) Push a bundle with a global repFactor=auto.
2) Try to push a bundle after removing the global repFactor=auto.
Anyway, a simple workaround is to push the bundle with validation skipped:
bin/splunk apply cluster-bundle --skip-validation