Deployment Architecture

Running "apply shcluster-bundle" in a search head cluster, why am I getting error "no captain found amongst members"?

rainerzufall
Path Finder

Hello,

Today I modified the /etc/system/local/authentication.conf file on all Search Head Cluster members, because most settings should be pushed by the Deployer in a separate app. Authentication is still working fine (local and LDAP).
Now, when I run /opt/splunk/bin/splunk apply shcluster-bundle ... I get the following error:

Error while deploying apps to target=https://name.xyz:8089 with members=2: no captain found amongst members

The internal log is as follows:

127.0.0.1 - admin [02/Mar/2016:11:10:35.265 +0000] "POST /services/apps/deploy HTTP/1.1" 500 245 - - - 10529ms

But the SHC looks fine in the Distributed Management Console and I get the following output when checking cluster status on CLI:

name1# /opt/splunk/bin/splunk show shcluster-status

 Captain:
                          dynamic_captain : 1
                          elected_captain : Wed Mar  2 10:48:04 2016
                                       id : B2542A43-0D49-4235-ABAA-6749581BA6DC
                         initialized_flag : 1
                                    label : name1
                         maintenance_mode : 0
                                 mgmt_uri : https://name1.xyz:8089
                    min_peers_joined_flag : 1
                     rolling_restart_flag : 0
                       service_ready_flag : 1

 Members: 
        name2
                                    label : name2
                                 mgmt_uri : https://name2.xyz:8089
                           mgmt_uri_alias : https://1.1.1.2:8089
                                   status : Up
        name3
                                    label : name3
                                 mgmt_uri : https://name3.xyz:8089
                           mgmt_uri_alias : https://1.1.1.3:8089
                                   status : Up

Thanks,
/Rainer

1 Solution

muebel
SplunkTrust

Hi Rainer, based on your show shcluster-status output, it looks like you are getting this message because the captain is not actually a member of the cluster. While name2 and name3 are present in the Members list, name1 is not.

Additionally, I would specifically target the captain when running the apply shcluster-bundle command, i.e. name1 instead of name.

I would try restarting name1 to see whether that prompts a re-election and, hopefully, lets name1 join the cluster successfully. Otherwise, it looks like you might have a deeper problem with the SHC that would require some assistance from support.
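The check described above (is the captain's label also listed under Members?) can be automated. The following is only a minimal sketch in Python: it assumes the text layout of the show shcluster-status output pasted in the question, and the function names and the shortened SAMPLE text are hypothetical, not part of any Splunk tooling.

```python
import re

# Matches lines such as "label : name1" after surrounding whitespace is stripped.
LABEL_RE = re.compile(r"^label\s*:\s*(\S+)$")


def parse_shcluster_status(text):
    """Return (captain_label, member_labels) parsed from status text."""
    captain_label = None
    member_labels = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Captain:":
            section = "captain"
            continue
        if line == "Members:":
            section = "members"
            continue
        match = LABEL_RE.match(line)
        if not match:
            continue
        if section == "captain":
            captain_label = match.group(1)
        elif section == "members":
            member_labels.append(match.group(1))
    return captain_label, member_labels


def captain_missing_from_members(text):
    """True when a captain is reported but its label is not among the members."""
    captain, members = parse_shcluster_status(text)
    return captain is not None and captain not in members


# Shortened copy of the output from the question (assumed layout).
SAMPLE = """\
 Captain:
                                     label : name1
                                  mgmt_uri : https://name1.xyz:8089

 Members:
        name2
                                     label : name2
                                    status : Up
        name3
                                     label : name3
                                    status : Up
"""

print(captain_missing_from_members(SAMPLE))
```

For the output in the question, the captain label is name1 and the member labels are name2 and name3, so the check reports the captain as missing.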

Please let me know if this helps!

pfender
Explorer

I recently ran into the same issue: a captain was elected but was missing from the member list and no longer responded to the other members. A reboot helped, but not for long; the cluster became unstable again pretty quickly. I started digging deeper and found the dispatch directory filling up (150k+ directories) because the reaper wasn't cleaning up, so I/O went through the roof. I identified a real-time scheduled search that caused Splunk (6.5.5) to keep all the rt_scheduler__nobody* directories. Rewriting the search fixed it. After I cleaned out the dispatch directory, the cluster ran fine again. I'm afraid I may have spotted a bug in this version.
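To see whether a dispatch directory is filling up in this way, the leftover directories can be counted. This is a rough Python sketch, not a Splunk utility: the function name is hypothetical, and the path in the usage note assumes the /opt/splunk install location from the question together with the default dispatch location.

```python
from pathlib import Path


def count_dispatch_dirs(dispatch_dir, prefix="rt_scheduler__nobody"):
    """Return (total_dirs, dirs_with_prefix) for the given dispatch directory."""
    total = 0
    matching = 0
    for entry in Path(dispatch_dir).iterdir():
        if not entry.is_dir():
            continue  # only search artifact directories are of interest
        total += 1
        if entry.name.startswith(prefix):
            matching += 1
    return total, matching
```

A call such as count_dispatch_dirs("/opt/splunk/var/run/splunk/dispatch") (assumed path) returns both counts, so a large share of rt_scheduler__nobody* entries points to the real-time scheduled search described above.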

rainerzufall
Path Finder

I rebooted the complete box (a splunk restart was not enough) and a new captain was elected. I now see all three nodes as cluster members. Thank you for the hint!

muebel
SplunkTrust

Awesome, glad to hear! 😄

Raghav2384
Motivator

How many search heads do you have in total (including the captain)? Is splunkd up on all of them?

Also, when you push authentication.conf, I am assuming you have the strategy with the BIND password on each and every search head as well, correct? Sorry if I misread; the reason I ask is that you cannot push one copy of an LDAP strategy from the Deployer where the password is already encrypted. It happened to me once during my new-to-SHC days.

And as @harsmarvania57 mentioned, name1 should appear in the members list as well.

Assuming you are on the latest build, have you tried this?
http://docs.splunk.com/Documentation/Splunk/6.3.3/DistSearch/Staticcaptain

Thanks,
Raghav

rainerzufall
Path Finder

There are 3 members total in the cluster and splunkd is up and running on all of them. LDAP config seems to be ok on all devices since I am able to login with the LDAP account when accessing the nodes directly.
I'll try the static captain thing...

rainerzufall
Path Finder

I tried the static captain configuration on the dynamic captain and got the following output:

 In handler 'shclusterconfig': Could not contact captain.  Check that the captain is up, the captain_uri=https://name1:8089 and secret are specified correctly Err : Failure, rc=2: Connect to=https://name1:8089 timed out; exceeded 30sec LowerLevelErrors = SocketError connecting to=name1:8089 WARN: Connect to=name1:8089 timed out; exceeded 30sec

It is extremely strange, but after rebooting the complete box (a splunkd restart was not enough), a new captain was elected and now everything is fine...
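The timeout in the error above means that a TCP connection to the captain's management port could not be established within 30 seconds. A quick way to test that from another member is a plain socket connect. This is a minimal Python sketch with a hypothetical function name; it only checks TCP reachability and says nothing about TLS or the shcluster secret.

```python
import socket


def mgmt_port_reachable(host, port=8089, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling mgmt_port_reachable("name1", 8089) from name2 and name3 would show whether they can reach the captain at all, which helps separate a network or host problem from a cluster configuration problem.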

harsmarvania57
Ultra Champion

Can you please check why Members shows only "name2" and "name3"? "name1" must be in Members as well.

rainerzufall
Path Finder

name1 is not in the members list. How can I check why?
