Environment:
I've followed the documentation closely (or so I thought) - everything went as expected until Step 5:
On one of the search heads I try running this command:
splunk bootstrap shcluster-captain -servers_list "https://sslsplkshc01.stroock.com:8089,https://sslsplkshc02.stroock.com:8089,https://sslsplkshc03.stroock.com:8089" -auth admin:[achangedpassword]
The error I'm getting is:
In handler 'shclustermemberconsensus': Failed to Set Configuration. One potential reason is captain could not hear back from all the nodes in a timeout period. Ensure all to be added nodes are up, and increase the raft timeout. If all nodes are up and running, look at splunkd.log for appendEntries errors due to mgmt_uri mismatch
Initialization went fine and the members restarted fine. I'm using the same URIs in this command as I did in the init command.
In splunkd.log on the instance I've tried this on - the pertinent errors appear to be:
11-01-2015 21:33:27.386 -0500 WARN SHPMasterHTTPProxy - Low Level http request failure err=failed method=POST path=/services/shcluster/captain/members/F7DCC5E0-922D-4906-AA24-20B3487AA6C3 captain=sslsplkshc01.stroock.com:8089 rc=0 actual_response_code=401 expected_response_code=200 status_line=Unauthorized error="<response>\n <messages>\n <msg type="WARN">call not properly authenticated</msg>\n </messages>\n</response>\n"
11-01-2015 21:33:28.724 -0500 ERROR SHPRaftConsensus - failed appendEntriesRequest err: error accessing https://sslsplkshc02.stroock.com:8089/services/shcluster/member/consensus/pseudoid/raft_append_entri..., statusCode=401, description=Unauthorized to https://sslsplkshc02.stroock.com:8089
11-01-2015 21:33:28.725 -0500 ERROR SHPRaftConsensus - failed appendEntriesRequest err: error accessing https://sslsplkshc03.stroock.com:8089/services/shcluster/member/consensus/pseudoid/raft_append_entri..., statusCode=401, description=Unauthorized to https://sslsplkshc03.stroock.com:8089
11-01-2015 21:33:32.388 -0500 WARN SHPMasterHTTPProxy - Low Level http request failure err=failed method=POST path=/services/shcluster/captain/members/F7DCC5E0-922D-4906-AA24-20B3487AA6C3 captain=sslsplkshc01.stroock.com:8089 rc=0 actual_response_code=401 expected_response_code=200 status_line=Unauthorized error="<response>\n <messages>\n <msg type="WARN">call not properly authenticated</msg>\n </messages>\n</response>\n"
11-01-2015 21:33:37.389 -0500 WARN SHPMasterHTTPProxy - Low Level http request failure err=failed method=POST path=/services/shcluster/captain/members/F7DCC5E0-922D-4906-AA24-20B3487AA6C3 captain=sslsplkshc01.stroock.com:8089 rc=0 actual_response_code=401 expected_response_code=200 status_line=Unauthorized error="<response>\n <messages>\n <msg type="WARN">call not properly authenticated</msg>\n </messages>\n</response>\n"
11-01-2015 21:34:28.710 -0500 INFO SHPRaftConsensus - Activating configuration 1:\n<configuration>\n<prev_configuration>\n<server>;\n<server_id>https://sslsplkshc01.stroock.com:8089</server_id>\n</server>\n</prev_configuration>\n<next_configuration>\n</next_configuration>\n</configuration>\n
11-01-2015 21:34:28.710 -0500 INFO SHPRaftConsensus - Exiting and deleting server : https://sslsplkshc02.stroock.com:8089
11-01-2015 21:34:28.710 -0500 INFO SHPRaftConsensus - Exiting and deleting server : https://sslsplkshc03.stroock.com:8089
11-01-2015 21:34:28.710 -0500 ERROR SHPRaftConsensus - Failed to Set Configuration. One potential reason is captain could not hear back from all the nodes in a timeout period. Ensure all to be added nodes are up, and increase the raft timeout. If all nodes are up and running, look at splunkd.log for appendEntries errors due to mgmt_uri mismatch
11-01-2015 21:34:32.406 -0500 WARN SHPMasterHTTPProxy - Low Level http request failure err=failed method=POST path=/services/shcluster/captain/members/F7DCC5E0-922D-4906-AA24-20B3487AA6C3 captain=sslsplkshc01.stroock.com:8089 rc=0 actual_response_code=401 expected_response_code=200 status_line=Unauthorized error="<response>\n <messages>\n <msg type="WARN">call not properly authenticated</msg>\n </messages>\n</response>\n"
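For anyone triaging a similar log, here is a minimal sketch of how the 401 responses above could be pulled out of splunkd.log to see which members are rejecting the captain's raft requests. The function name `unauthorized_peers` is made up for illustration, and the pattern assumes the log lines look like the ones pasted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the distinct member URIs that answered the
# captain's raft appendEntries requests with HTTP 401.
# Assumes the SHPRaftConsensus line format shown in the log excerpt above.
unauthorized_peers() {
    local log_file="$1"
    grep 'SHPRaftConsensus' "$log_file" \
        | grep 'statusCode=401' \
        | sed -n 's|.*description=Unauthorized to \(https://[^ ]*\).*|\1|p' \
        | sort -u
}
```

It would be called with the path to splunkd.log on the instance where the bootstrap was attempted, for example `unauthorized_peers "$SPLUNK_HOME/var/log/splunk/splunkd.log"`. If every other member shows up in the output, the problem is more likely on the receiving side (wrong service on that port, or a mismatched secret) than with the admin credentials passed via -auth.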
Any help appreciated. From other discussions, a few things that may be pertinent: I'm not running any of these commands on the deployer. My admin password does have special characters in it; I've tried surrounding it with single and double quotes, with the same result. Also, I am able to run this command:
splunk show shcluster-status -auth <username>:<password>
successfully so I don't think it's a credential issue, although that's what the errors sound like... The result of that is:
Captain:
dynamic_captain : 1
elected_captain : Sun Nov 01 21:16:11 2015
id : A3A44C35-B1C3-4723-959E-F5621F7883CF
initialized_flag : 0
label : SSLSPLKSHC01
maintenance_mode : 0
mgmt_uri : https://sslsplkshc01.stroock.com:8089
min_peers_joined_flag : 0
rolling_restart_flag : 0
service_ready_flag : 0
It turned out the Splunk universal forwarder had been installed first and was using the default management port (8089). After changing ports, everything worked as expected.
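In case it helps someone check for the same port conflict, here is a rough sketch, assuming Linux hosts where `ss` is available. The helper name `listeners_on_port` is made up for illustration; it only filters `ss -ltnp` output read from stdin and does not call any Splunk command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ss -ltnp` output on stdin and print the local
# address and owning process for every listening socket bound to PORT.
listeners_on_port() {
    local port="$1"
    awk -v port="$port" '$1 == "LISTEN" && $4 ~ (":" port "$") { print $4, $6 }'
}

# Usage (root is usually needed to see processes owned by other users):
#   sudo ss -ltnp | listeners_on_port 8089
```

Both the universal forwarder and a full Splunk instance run a process named splunkd, so the process name alone will not tell them apart. Comparing the reported PID against each install directory (for example with `ps -o args= -p <pid>`) should show which installation actually owns the port.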
Just fixed the exact same issue, but for me it was a fat-finger typo in the -secret of the bootstrap command ... took me some time to find it.
cheers, MuS