I have a clustered environment with many indexers, but for some reason one of the indexers is not playing nicely with the rest of them.
Any time I restart the CM (Cluster Master), this indexer is missing from the dashboard until I restart the indexer itself. Both the CM and this troubled indexer are on 6.4.5. All the other indexers are on 6.3.2.
Upon further investigation of the cluster-config, I noticed that this trouble server is missing the cluster_label. All documentation says that this is supposed to be set on the master node, but I can't for the life of me figure out how to get it to propagate to this machine.
This problem server also seems to be having issues with synchronizing its indexes with the rest of the cluster. When this indexer is in the cluster, there is always some small portion of search and replication factors not being met (just a few buckets seem to be off).
Has anyone run into this issue before? I can't seem to find any information on these symptoms.
I have tested on a 6.6 indexer and I no longer see this message post-restart of an indexer...
I've had a case open with Splunk support on this topic since June 2016, and I've just gotten the answer that Splunk 6.6 will fix this.
Adding the cluster label did not fix the issue in my environment.
Hmm. This is a good read. Regarding my environment, Splunk Enterprise is installed on all machines, and they are on the same network running the same OS. I do see the note further down saying that peer nodes need to run the same version, down to the maintenance level, but even further down it says that mixed-version clusters are compatible from version 6.1+ on peers and 6.2 for the master. I have another environment with a similar Splunk version disparity within the cluster (6.3.2 and 6.4.5 simultaneously, as I am slowly migrating machines to 6.4.5 at the moment) that is not running into this problem.
Try putting the cluster in maintenance mode, then on the problem child, run this command:
splunk edit cluster-config -cluster_label <CLUSTER LABEL>
Then restart the peer and take the cluster out of maintenance mode.
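The steps above can be written out as commands. This is only a sketch: it assumes the CLI lives at /opt/splunk/bin/splunk, and the function names are made up for illustration. Maintenance mode is toggled on the master, while the label edit and restart happen on the problem peer.

```shell
# Sketch only. SPLUNK is an assumed path; override it if yours differs.
SPLUNK="${SPLUNK:-/opt/splunk/bin/splunk}"

# Run on the cluster master, before touching the peer.
pause_cluster() {
  "$SPLUNK" enable maintenance-mode
}

# Run on the problem peer. $1 is the label the master already uses.
relabel_peer() {
  "$SPLUNK" edit cluster-config -cluster_label "$1" &&
    "$SPLUNK" restart &&
    "$SPLUNK" list cluster-config   # check whether cluster_label shows up
}

# Run on the cluster master once the peer has rejoined.
resume_cluster() {
  "$SPLUNK" disable maintenance-mode
}
```

On the master, call pause_cluster. On the peer, call relabel_peer with your cluster label. Back on the master, call resume_cluster.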
I've actually tried this before, but after setting the cluster_label and then listing the cluster-config, there is still no cluster_label in there, even after a restart. I've also ensured that permissions aren't an issue on anything in the /opt/splunk directory.
I really appreciate the suggestions and assistance.
With this search, I get nothing. However, when I dropped the "host=badIndexer", I got TONS of "WARN ServerInfoHandler - Should not happen : Indexer cluster label should not be empty as it should default to CM's GUID", but that has already been established.
When I wrote host=badIndexer, I meant for you to replace "badIndexer" with the host that has 6.4.5 installed. Can you share a message regarding the cluster label?
Do you have
cluster_label = <YourClusterLabel> on all indexer cluster peers and the master?
... yes, that makes perfect sense. I replaced "badIndexer" with the problem child's hostname and I have tons of results saying "WARN ServerInfoHandler - Should not happen : Indexer cluster label should not be empty as it should default to CM's GUID."
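For anyone who wants to confirm the same warning without going through search, a plain grep of splunkd.log on the peer is one option. This is a sketch, not the search used in this thread: the function name is made up, and the log path in the usage line assumes a default install under /opt/splunk.

```shell
# Count the empty-cluster-label warnings in a splunkd.log file.
# $1 is the log file path. Prints the number of matching lines.
count_label_warnings() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c 'Indexer cluster label should not be empty' "$1" || true
}
```

Usage on the problem peer: count_label_warnings /opt/splunk/var/log/splunk/splunkd.log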
The Master and all the other peers have the correct cluster_label in their cluster-config when I list them. It's just the problem server that seems to refuse to accept that setting.
The server.conf on the bad indexer is identical to the other indexers (except for serverName under [general] referencing itself).
What may be of interest: the [clustering] stanza lists only the master_uri and mode=slave. This is consistent across all indexers in the cluster.
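To compare the [clustering] stanza across peers, a small helper like the one below can pull one setting out of a server.conf file. It is only a sketch: conf_value is a made-up name, it reads a single file rather than Splunk's merged configuration, and it assumes simple key = value lines.

```shell
# Print the value of a key inside one stanza of a .conf file.
# $1 = file path, $2 = stanza name (without brackets), $3 = key.
# Prints nothing if the stanza or key is absent.
conf_value() {
  awk -v stanza="[$2]" -v key="$3" '
    # A stanza header: remember whether it is the one we want.
    /^\[/ { inside = ($0 == stanza); next }
    # Inside the wanted stanza, split each line on the first "=".
    inside && index($0, "=") {
      k = substr($0, 1, index($0, "=") - 1)
      v = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) print v
    }
  ' "$1"
}
```

For example, conf_value /opt/splunk/etc/system/local/server.conf clustering cluster_label prints nothing when the setting is missing from that file. For the merged view across all configuration layers, splunk btool server list clustering --debug shows each effective setting along with the file it came from.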