Running "apply shcluster-bundle" in a search head cluster, why am I getting error "no captain found amongst members"?

Path Finder

Hello,

Today I modified the /etc/system/local/authentication.conf file on all Search Head Cluster members, because most settings should be pushed by the Deployer in a separate app. Authentication is still working fine (local and LDAP).
Now, when I run /opt/splunk/bin/splunk apply shcluster-bundle ..., I get the following error:

Error while deploying apps to target=https://name.xyz:8089 with members=2: no captain found amongst members

The internal log is as follows:

127.0.0.1 - admin [02/Mar/2016:11:10:35.265 +0000] "POST /services/apps/deploy HTTP/1.1" 500 245 - - - 10529ms

But the SHC looks fine in the Distributed Management Console and I get the following output when checking cluster status on CLI:

name1# /opt/splunk/bin/splunk show shcluster-status

 Captain:
                          dynamic_captain : 1
                          elected_captain : Wed Mar  2 10:48:04 2016
                                       id : B2542A43-0D49-4235-ABAA-6749581BA6DC
                         initialized_flag : 1
                                    label : name1
                         maintenance_mode : 0
                                 mgmt_uri : https://name1.xyz:8089
                    min_peers_joined_flag : 1
                     rolling_restart_flag : 0
                       service_ready_flag : 1

 Members: 
        name2
                                    label : name2
                                 mgmt_uri : https://name2.xyz:8089
                           mgmt_uri_alias : https://1.1.1.2:8089
                                   status : Up
        name3
                                    label : name3
                                 mgmt_uri : https://name3.xyz:8089
                           mgmt_uri_alias : https://1.1.1.3:8089
                                   status : Up

Thanks,
/Rainer

1 Solution

SplunkTrust

Hi Rainer, based on your show shcluster-status output, it looks like you are getting this message because the captain is not actually a member of the cluster: name2 and name3 are present in the members list, but name1 is not.

Additionally, I would specifically target the captain when running the apply shcluster-bundle command, i.e. pass -target https://name1.xyz:8089 instead of https://name.xyz:8089.
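For anyone who wants to automate this check, here is a minimal Python sketch that parses the text printed by `splunk show shcluster-status` and reports whether the captain's label also appears under Members. The function names and the trimmed `SAMPLE` text are my own illustration, and the parsing assumes the output layout shown in the question; other Splunk versions may print it differently.

```python
import re


def parse_shcluster_status(text):
    """Return (captain_label, member_labels) from `show shcluster-status` text.

    Assumes the layout shown in this thread: a "Captain:" section followed
    by a "Members:" section, each containing "label : <name>" lines.
    """
    captain_label = None
    member_labels = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Captain:":
            section = "captain"
        elif line == "Members:":
            section = "members"
        else:
            match = re.match(r"label\s*:\s*(\S+)$", line)
            if match and section == "captain":
                captain_label = match.group(1)
            elif match and section == "members":
                member_labels.append(match.group(1))
    return captain_label, member_labels


def captain_is_member(text):
    """True only if a captain label was found and it is listed as a member."""
    captain_label, member_labels = parse_shcluster_status(text)
    return captain_label is not None and captain_label in member_labels


# Trimmed copy of the output from the question (illustrative only).
SAMPLE = """\
 Captain:
                                     label : name1
                                  mgmt_uri : https://name1.xyz:8089

 Members: 
        name2
                                     label : name2
                                    status : Up
        name3
                                     label : name3
                                    status : Up
"""

if __name__ == "__main__":
    print(captain_is_member(SAMPLE))  # prints False: name1 is not under Members
```

In practice you would feed it the captured CLI output instead of `SAMPLE`.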

I would try a restart of name1 and see if that prompts a re-election and hopefully have name1 join the cluster successfully. Otherwise it looks like you might have a deeper problem with the SHC that would require some assistance from support.

Please let me know if this helps!

Explorer

I recently ran into the same issue: a captain was elected but was missing from the member list and no longer responded to the other members. A reboot helped, but not for long; the cluster became unstable again pretty quickly. I started digging deeper and found the dispatch directory filling up (more than 150k directories) while the reaper did not clean up, so I/O went through the roof. I identified a real-time scheduled search that caused Splunk (6.5.5) to keep all the rt_scheduler__nobody* directories. Rewriting the search fixed it. After cleaning the dispatch directory, the cluster was running fine again. I am afraid I may have spotted a bug in this version.
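If you want to check for the same symptom, a rough Python sketch like the one below counts dispatch subdirectories by name prefix. The function name and the prefix list are my own choices, and the usual location `$SPLUNK_HOME/var/run/splunk/dispatch` (here `/opt/splunk/...`) is an assumption about your install; the demo at the bottom only runs against a throwaway temp directory.

```python
import os
import tempfile
from collections import Counter


def count_dispatch_dirs(dispatch_path, prefixes=("rt_scheduler__", "scheduler__")):
    """Count subdirectories of dispatch_path, grouped by the first matching prefix.

    Directories matching no prefix are counted under "other"; plain files
    are ignored.
    """
    counts = Counter()
    with os.scandir(dispatch_path) as entries:
        for entry in entries:
            if not entry.is_dir():
                continue
            for prefix in prefixes:
                if entry.name.startswith(prefix):
                    counts[prefix] += 1
                    break
            else:
                counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Demo against a temporary directory with made-up job directory names.
    with tempfile.TemporaryDirectory() as demo_root:
        for name in ("rt_scheduler__nobody__demo1", "scheduler__admin__demo2"):
            os.mkdir(os.path.join(demo_root, name))
        print(count_dispatch_dirs(demo_root))
```

On a real search head you would call `count_dispatch_dirs("/opt/splunk/var/run/splunk/dispatch")`; a very large `rt_scheduler__` count would match what I saw.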

Path Finder

I did a reboot of the complete box (a splunk restart was not enough) and a new captain was elected. I now see all three nodes as cluster members. Thank you for the hint!

SplunkTrust

Awesome, glad to hear! 😄

Motivator

How many search heads do you have in total (including the captain)? Is splunkd up on all of them?

Also, when you push authentication.conf, I am assuming you have the LDAP strategy with the BIND password on each and every search head as well, correct? Sorry if I misread. The reason I ask is that you cannot push one copy of an LDAP strategy from the Deployer where the password is already encrypted. It happened to me once in my early days with SHC.

And like @harsmarvania57 mentioned, name1 should appear in the members list as well.

Assuming you are on the latest build, have you tried this?
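To spot that situation before pushing, something like the following Python sketch can flag LDAP strategy stanzas whose `bindDNpassword` already looks encrypted. It assumes that Splunk marks encrypted values with a `$1$` or `$7$` prefix, which is my understanding rather than something stated in this thread; the function name, stanza names, and sample values are hypothetical.

```python
def find_encrypted_bind_passwords(conf_text):
    """Return stanza names whose bindDNpassword value looks already encrypted.

    Assumption: encrypted values start with "$1$" or "$7$". Anything else
    is treated as clear text.
    """
    stanza = None
    hits = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            stanza = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "bindDNpassword":
            if value.strip().startswith(("$1$", "$7$")):
                hits.append(stanza)
    return hits


# Hypothetical authentication.conf content, for illustration only.
SAMPLE_CONF = """\
[authentication]
authType = LDAP
authSettings = corp_ldap

[corp_ldap]
bindDN = cn=splunk,dc=example,dc=com
bindDNpassword = $1$abcdefg==

[lab_ldap]
bindDNpassword = cleartext-placeholder
"""

if __name__ == "__main__":
    print(find_encrypted_bind_passwords(SAMPLE_CONF))  # prints ['corp_ldap']
```

You would run it over the authentication.conf inside the app staged on the Deployer.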
http://docs.splunk.com/Documentation/Splunk/6.3.3/DistSearch/Staticcaptain

Thanks,
Raghav

Path Finder

There are 3 members in total in the cluster, and splunkd is up and running on all of them. The LDAP config seems to be OK on all devices, since I am able to log in with the LDAP account when accessing the nodes directly.
I'll try the static captain thing...

Path Finder

I tried the static captain configuration on the dynamic captain and got the following output:

 In handler 'shclusterconfig': Could not contact captain.  Check that the captain is up, the captain_uri=https://name1:8089 and secret are specified correctly Err : Failure, rc=2: Connect to=https://name1:8089 timed out; exceeded 30sec LowerLevelErrors = SocketError connecting to=name1:8089 WARN: Connect to=name1:8089 timed out; exceeded 30sec
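The message above is a connect timeout on the captain's management port, so a basic reachability test from another member can help separate a network or hung-process problem from a configuration one. Below is a minimal Python sketch using only the standard library; the function name and default timeout are my own, and a successful TCP connect says nothing about whether splunkd is actually answering requests on that port.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False
```

For example, `can_connect("name1", 8089, timeout=30)` run on name2 or name3 would mirror the 30-second limit from the error message.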

It is extremely strange, but after rebooting the complete box (a splunkd restart was not enough), a new captain was elected and now everything is fine...

SplunkTrust

Can you please check why the Members list shows only "name2" and "name3"? "name1" must be in Members as well.

Path Finder

name1 is not in the members list. How can I check why?
