Hello,
Today I modified the /etc/system/local/authentication.conf file on all Search Head Cluster members, because most settings should be pushed by the Deployer in a separate app. Authentication is still working fine (local and LDAP)...
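For reference, the plan is to ship the file from the Deployer in a dedicated app, placed roughly like this on the Deployer (the app name here is just a placeholder I picked):

/opt/splunk/etc/shcluster/apps/org_shc_authentication/local/authentication.conf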
Now, when I run /opt/splunk/bin/splunk apply shcluster-bundle ...
I get the following error:
Error while deploying apps to target=https://name.xyz:8089 with members=2: no captain found amongst members
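For completeness, the command I run on the Deployer looks roughly like this (credentials are a placeholder):

/opt/splunk/bin/splunk apply shcluster-bundle -target https://name.xyz:8089 -auth admin:<password>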
The internal log is as follows:
127.0.0.1 - admin [02/Mar/2016:11:10:35.265 +0000] "POST /services/apps/deploy HTTP/1.1" 500 245 - - - 10529ms
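I found that entry with a search along these lines (field names assume the default splunkd_access extractions):

index=_internal sourcetype=splunkd_access uri_path="/services/apps/deploy" status=500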
But the SHC looks fine in the Distributed Management Console, and I get the following output when checking the cluster status on the CLI:
name1# /opt/splunk/bin/splunk show shcluster-status
Captain:
dynamic_captain : 1
elected_captain : Wed Mar 2 10:48:04 2016
id : B2542A43-0D49-4235-ABAA-6749581BA6DC
initialized_flag : 1
label : name1
maintenance_mode : 0
mgmt_uri : https://name1.xyz:8089
min_peers_joined_flag : 1
rolling_restart_flag : 0
service_ready_flag : 1
Members:
name2
label : name2
mgmt_uri : https://name2.xyz:8089
mgmt_uri_alias : https://1.1.1.2:8089
status : Up
name3
label : name3
mgmt_uri : https://name3.xyz:8089
mgmt_uri_alias : https://1.1.1.3:8089
status : Up
Thanks,
/Rainer
Hi Rainer, based on your show shcluster-status output, it looks like you are getting this message because the captain is actually not a member of the cluster. While name2 and name3 are present in the members list, name1 is not.
Additionally, I would specifically target the captain when running the apply shcluster-bundle command, i.e. name1 instead of name.
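Something like this, run from the Deployer (password is a placeholder):

/opt/splunk/bin/splunk apply shcluster-bundle -target https://name1.xyz:8089 -auth admin:<password>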
I would try a restart of name1 and see if that prompts a re-election; hopefully name1 will then join the cluster successfully. Otherwise it looks like you might have a deeper problem with the SHC that would require some assistance from Support.
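That is, on name1 (and then verify from any member):

/opt/splunk/bin/splunk restart
/opt/splunk/bin/splunk show shcluster-status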
Please let me know if this helps!
I recently ran into the same issue: the captain was elected but missing from the member list, and it no longer responded to the other members. A reboot helped, but not for long; the cluster became unstable again pretty quickly. I started digging deeper and found the dispatch directory filling up (150k+ directories) while the reaper didn't clean up, so I/O went up like crazy. I identified a real-time scheduled search that caused Splunk (6.5.5) to keep all the rt_scheduler__nobody* directories. A rewrite of the search fixed it. After cleaning the dispatch directory, the cluster was running fine again. I'm afraid I spotted a possible bug in this version.
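For anyone hitting the same thing, a quick way to check whether dispatch is the culprit (paths assume a default /opt/splunk install):

ls /opt/splunk/var/run/splunk/dispatch | wc -l
ls -d /opt/splunk/var/run/splunk/dispatch/rt_scheduler__nobody* 2>/dev/null | wc -l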
I did a reboot of the complete box (a splunk restart was not enough) and a new captain was elected. I now see all three nodes as cluster members. Thank you for the hint!
Awesome, glad to hear! 😄
How many search heads do you have in total (including the captain)? Is splunkd up on all of them?
Also, when you push authentication.conf, I am assuming you have the LDAP strategy with the bind password on each and every search head as well, correct? Sorry if I misread; the reason I ask is that you cannot push one copy of an LDAP strategy from the Deployer where the password is already encrypted. It happened to me once when I was new to SHC.
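In other words, each search head needs a strategy stanza whose password starts out in plaintext on that host, because splunkd encrypts it per-host on restart. A sketch with placeholder names and DNs:

[authentication]
authType = LDAP
authSettings = corpLdap

[corpLdap]
host = ldap.example.com
port = 636
SSLEnabled = 1
bindDN = cn=splunk-bind,ou=services,dc=example,dc=com
# plaintext on first deploy; splunkd encrypts this per-host on restart
bindDNpassword = changeme
userBaseDN = ou=people,dc=example,dc=com
userNameAttribute = uid
realNameAttribute = cn
groupBaseDN = ou=groups,dc=example,dc=com
groupNameAttribute = cn
groupMemberAttribute = member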
And like @harsmarvania57 mentioned, name1 should appear in the members list as well.
Assuming you are on the latest build, have you tried this?
http://docs.splunk.com/Documentation/Splunk/6.3.3/DistSearch/Staticcaptain
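If I remember the doc correctly, the gist is that election is disabled and one member is pinned as captain. On the member that should become captain:

/opt/splunk/bin/splunk edit shcluster-config -mode captain -captain_uri https://name1.xyz:8089 -election false

And on every other member:

/opt/splunk/bin/splunk edit shcluster-config -mode member -captain_uri https://name1.xyz:8089 -election false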
Thanks,
Raghav
There are 3 members total in the cluster and splunkd is up and running on all of them. The LDAP config seems to be OK on all devices, since I am able to log in with the LDAP account when accessing the nodes directly.
I'll try the static captain thing...
I tried the static captain configuration on the dynamic captain and got the following output:
In handler 'shclusterconfig': Could not contact captain. Check that the captain is up, the captain_uri=https://name1:8089 and secret are specified correctly Err : Failure, rc=2: Connect to=https://name1:8089 timed out; exceeded 30sec LowerLevelErrors = SocketError connecting to=name1:8089 WARN: Connect to=name1:8089 timed out; exceeded 30sec
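A quick way to sanity-check that the management port is reachable at all (standard curl; credentials are placeholders):

curl -k -u admin:<password> https://name1.xyz:8089/services/server/info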
It is extremely strange, but after rebooting the complete box (a splunkd restart was not enough), a new captain was elected and now everything is fine...
Can you please check why the members list shows only "name2" and "name3"? "name1" must be in the members list as well.
name1 is not in the members list. How can I check why?
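Is there a way to query this directly? I was thinking of asking the captain for its member list via REST, along these lines (placeholder credentials):

curl -k -u admin:<password> https://name1.xyz:8089/services/shcluster/captain/members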