Why does a peer in an indexer cluster not have the "cluster_label" label?

bchau123 · ‎03-28-2017

I have a clustered environment with many indexers, but for some reason one of the indexers is not playing nice with the rest of them.

Any time I restart the CM (Cluster Master), this indexer is always missing from the dashboard until I restart it. Both the CM and this troubled indexer are on 6.4.5. All the other indexers are on 6.3.2.

Upon further investigation of the cluster-config, I noticed that this trouble server is missing the cluster_label. All documentation says that this is supposed to be set on the master node, but I can't for the life of me figure out how to get it to propagate to this machine.

This problem server also seems to be having issues with synchronizing its indexes with the rest of the cluster. When this indexer is in the cluster, there is always some small portion of search and replication factors not being met (just a few buckets seem to be off).

Has anyone run into this issue before? I can't seem to find any information on these symptoms.

chrishartsock · ‎08-10-2017

Has upgrading to 6.6 fixed this issue for you?

gjanders · ‎08-10-2017

I have tested on a 6.6 indexer and I no longer see this message post-restart of an indexer...

-
Alerts for Splunk Admins, Version Control for Splunk, Decrypt2 VersionControl For SplunkCloud

gjanders · ‎04-02-2017

I've had a case open with Splunk support on this topic since June 2016, and I've just got the answer that Splunk 6.6 will fix this.

Adding the cluster label did not fix the issue in my environment.


adonio · ‎03-28-2017

hi bchau123,
please read here: http://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/Systemrequirements
especially this: "There are strict version compatibility requirements between cluster nodes"

bchau123 · ‎03-28-2017

Hmm. This is a good read. Regarding my environment, Splunk Enterprise is installed on all machines and they are on the same network running the same OS. I do see the note further down that peer nodes need to run the same version, down to the maintenance level, but even further down it says that mixed-version clusters are compatible from version 6.1+ on peers and 6.2 for the master. I have another environment with a similar Splunk version disparity within the cluster (6.3.2 and 6.4.5 simultaneously, since I am slowly migrating machines to 6.4.5 at the moment) that is not running into this problem.
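
To make the two rules discussed here concrete, the following is a minimal sketch of a version check, assuming the rules are as read from the linked system requirements page: peers share one version, and the master is not older than any peer. The host names and the function names are hypothetical, and since the same page also describes supported mixed-version setups, a reported problem is only a flag to investigate, not proof of a misconfiguration.

```python
def parse_version(text):
    """Turn '6.4.5' into (6, 4, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_cluster_versions(master, peers):
    """Return a list of human-readable problems (empty if none are found).

    `peers` maps a hypothetical host name to its version string.
    """
    if not peers:
        return []
    problems = []
    peer_versions = {parse_version(v) for v in peers.values()}
    # Rule 1 (assumed): all peers run the same version.
    if len(peer_versions) > 1:
        found = ", ".join(".".join(map(str, v)) for v in sorted(peer_versions))
        problems.append("peers run different versions: " + found)
    # Rule 2 (assumed): the master is not older than any peer.
    if parse_version(master) < max(peer_versions):
        problems.append("master is older than at least one peer")
    return problems


if __name__ == "__main__":
    # Versions taken from the thread; host names are made up.
    peers = {"idx1": "6.3.2", "idx2": "6.3.2", "idx3": "6.4.5"}
    for problem in check_cluster_versions("6.4.5", peers):
        print(problem)
```

With the versions from this thread, the sketch flags only the mixed peer versions, since the master at 6.4.5 is not older than any peer.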

adonio · ‎03-28-2017

try and put cluster peers in maintenance mode, then on the problem child, run this command:

splunk edit cluster-config -cluster_label <CLUSTER LABEL>

restart the peer and move out of maintenance mode
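
The steps above can be written out as an ordered plan. This is only a sketch: it builds and prints the commands without running them, it assumes maintenance mode is toggled on the master while the label edit and restart happen on the peer, and it assumes Splunk is installed under /opt/splunk. The `list cluster-config` verification step and the function name `label_fix_plan` are additions for illustration.

```python
import shlex


def label_fix_plan(cluster_label, splunk_bin="/opt/splunk/bin/splunk"):
    """Return (where, command) pairs in the order suggested in the thread."""
    return [
        ("master", [splunk_bin, "enable", "maintenance-mode"]),
        ("peer", [splunk_bin, "edit", "cluster-config", "-cluster_label", cluster_label]),
        ("peer", [splunk_bin, "restart"]),
        # Assumed extra step: confirm the label is visible after the restart.
        ("peer", [splunk_bin, "list", "cluster-config"]),
        ("master", [splunk_bin, "disable", "maintenance-mode"]),
    ]


if __name__ == "__main__":
    # "my_cluster" is a placeholder label, not a value from the thread.
    for where, command in label_fix_plan("my_cluster"):
        print(f"[{where}] {shlex.join(command)}")
```

Printing the plan first makes it easy to check which host each command belongs on before anything is changed.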

bchau123 · ‎03-28-2017

I've actually tried this before, but after setting the cluster_label and then listing the cluster-config, there is still no cluster_label in there, even after a restart. I've also ensured that permissions aren't an issue on anything in the /opt/splunk directory.

I really appreciate the suggestions and assistance.

adonio · ‎03-28-2017

What errors or warnings do you get when you search index=_internal sourcetype=splunkd host=badIndexer (log_level=WARN OR log_level=ERROR)?

bchau123 · ‎03-28-2017

With this search, I get nothing. However, when I dropped the "host=badIndexer" I got TONS of "WARN ServerInfoHandler - Should not happen : Indexer cluster label should not be empty as it should default to CM's GUID", but that has already been established.

adonio · ‎03-28-2017

When I put host=badIndexer, I meant for you to replace "badIndexer" with your host that has 6.4.5 installed. Can you share a message regarding the cluster label?
Do you have cluster_label = <YourClusterLabel> on all indexer cluster peers and the master?

bchau123 · ‎03-28-2017

... yes that makes perfect sense. I replaced "badIndexer" with the problem child's hostname and I have tons of results saying "WARN ServerInfoHandler - Should not happen : Indexer cluster label should not be empty as it should default to CM's GUID."

The Master and all the other peers have the correct cluster_label in their cluster-config when I list them. It's just the problem server that seems to refuse to accept that setting.

adonio · ‎03-28-2017

is it single site or multisite?

bchau123 · ‎03-28-2017

this is a single site cluster.

adonio · ‎03-28-2017

can you share server.conf of the bad indexer?

bchau123 · ‎03-28-2017

The server.conf on the bad indexer is identical to the other indexers (except for serverName under [general] referencing itself).

What may be of interest: the [clustering] stanza lists only the master_uri and mode=slave. This is consistent between all indexers in the cluster.
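
One way to double-check that two server.conf files really match in their [clustering] stanza is to compare them programmatically. The sketch below is an illustration only: it assumes the files are simple enough for Python's `configparser` to read (Splunk .conf syntax is not strictly INI), it looks at a single file rather than Splunk's merged configuration, and the sample values (`idx-example`, `cm.example.com`, `my_cluster`) are placeholders.

```python
import configparser

# Placeholder content shaped like the stanza described above.
SAMPLE = """
[general]
serverName = idx-example

[clustering]
master_uri = https://cm.example.com:8089
mode = slave
"""


def clustering_settings(conf_text):
    """Return the [clustering] stanza of a server.conf-like text as a dict."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(conf_text)
    if not parser.has_section("clustering"):
        return {}
    return dict(parser["clustering"])


def diff_clustering(conf_a, conf_b):
    """Map each differing key to its (value_in_a, value_in_b) pair."""
    a, b = clustering_settings(conf_a), clustering_settings(conf_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    other = SAMPLE + "cluster_label = my_cluster\n"
    print(diff_clustering(SAMPLE, other))
```

An empty result for the good and bad indexer would support the observation that the files on disk match, which points at the effective runtime configuration rather than server.conf itself.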

adonio · ‎03-28-2017

Do you have site = site<n> under default or under general?
Do you use pass4SymmKey?
