I am trying to use register_replication_address to tell a cluster master running in front of an ELB to contact the indexer on a different address. However, when I try to add the peer I get the following error on the CM:
REST_Calls - app=search POST cluster/master/peers/ id=526D8BF5-7412-4934-AC47-08C699290CC9: active_bundle_id -> [14310A4AABD23E85BBD4559C4A3B59F8], add_type -> [Initial-Add], base_generation_id -> [0], batch_serialno -> [1], batch_size -> [2], buckets -> [], forwarderdata_rcv_port -> [9997], forwarderdata_use_ssl -> [0], indexes -> [], last_complete_generation_id -> [0], latest_bundle_id -> [14310A4AABD23E85BBD4559C4A3B59F8], mgmt_port -> [8089], register_forwarder_address -> [], register_replication_address -> [https://10.0.7.181:8089], register_search_address -> [], replication_port -> [9887], replication_use_ssl -> [0], replications -> [], server_name -> [ip-10-0-7-181.ca-central-1.compute.internal], site -> [default], splunk_version -> [8.0.2], splunkd_build_number -> [a7f645ddaf91], status -> [Up]
INFO AdminManager - Setting capability.write=edit_indexer_cluster for handler clustermasterpeers.
INFO AdminManager - Setting capability.read=edit_indexer_cluster for handler clustermasterpeers.
DEBUG AdminManager - Validating argument values...
DEBUG AdminManagerValidation - Validating rule='validate(len(name) < 1024, 'Parameter "name" must be less than 1024 characters.')' for arg='name'.
ERROR ClusterMasterPeerHandler - Invalid host name https://10.0.7.181:8089
DEBUG AdminManager - URI /services/cluster/master/peers/?output_mode=json generated an AdminManagerExceptionBase exception in handler 'clustermasterpeers': Invalid host name https://10.0.7.181:8089
INFO CMSlave - event=addPeer status=failure shutdown=false request: AddPeerRequest: { _id= _indexVec=''active_bundle_id=14310A4AABD23E85BBD4559C4A3B59F8 add_type=Initial-Add base_generation_id=0 batch_serialno=1 batch_size=2 forwarderdata_rcv_port=9997 forwarderdata_use_ssl=0 last_complete_generation_id=0 latest_bundle_id=14310A4AABD23E85BBD4559C4A3B59F8 mgmt_port=8089 name=526D8BF5-7412-4934-AC47-08C699290CC9 register_forwarder_address= register_replication_address=https://10.0.7.181:8089 register_search_address= replication_port=9887 replication_use_ssl=0 replications= server_name=ip-10-0-7-181.ca-central-1.compute.internal site=default splunk_version=8.0.2 splunkd_build_number=a7f645ddaf91 status=Up }
04-23-2020 02:03:56.478 +0000 ERROR CMSlave - event=addPeer start over and retry after sleep 12800ms reason addType=Initial Add Batch SN=1/2 failed. add_peer_network_ms=5
Notice that the debug output mentions a rule requiring the name to be less than 1024 characters. Could it be failing that validation?
The Cluster Master can "resolve" the IP, although since it is already an IP I see no reason why it should need resolving, and the "can't resolve '(null)'" line is weird. I added a hosts file entry, but it made no difference:
`
nslookup: can't resolve '(null)'
Name: 10.0.7.181
Address 1: 10.0.7.181 ip-10-0-7-181.ca-central-1.compute.internal
The Clustermaster can reach the Indexer on that port:
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.7.181:8089.
`
Any reason why this happens?
I've read a few posts, and register_replication_address seems to be the solution to my problem; however, I am unsure why it is "unable to resolve hostname".
*** UPDATE ***
I also want to add that I've been doing more testing on two plain EC2 instances with all traffic allowed between them. nslookup on AWS resolves the IPs fine, and I still cannot get this working. If I remove register_replication_address in these cases, the peer joins without any problem, which is really weird. I'm not sure what the issue is or how to troubleshoot further when the log just says "Invalid host name".
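One thing I can check from the log itself: the value the CM rejects is a full URI (https://10.0.7.181:8089), not a bare IP or hostname. I have not confirmed that this is the cause, so treat the following as a minimal sketch under that assumption. The helper name `bare_host` is hypothetical and not part of Splunk; it only shows how to reduce such a value to the host part so the two forms can be compared or tested in the setting.

```python
from urllib.parse import urlsplit


def bare_host(value: str) -> str:
    """Return only the host part of `value`, dropping scheme, port, and path.

    Hypothetical helper for troubleshooting; not a Splunk API.
    """
    # urlsplit only fills in the network location when the string contains
    # "//", so prepend it for inputs such as "10.0.7.181" or "10.0.7.181:8089".
    candidate = value if "//" in value else "//" + value
    return urlsplit(candidate).hostname or ""


if __name__ == "__main__":
    # The first value is the one shown in the "Invalid host name" error above.
    for raw in ("https://10.0.7.181:8089", "10.0.7.181:8089", "10.0.7.181"):
        print(f"{raw!r} -> {bare_host(raw)!r}")
```

If the setting does expect a bare address, the value to try would be the output of `bare_host`, leaving the port to mgmt_port and replication_port, which the peer already reports separately in the request above.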