As per http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Handlemasternodefailure, I'm using an ELB to front the master.
The issue I'm hitting is that the master sees the peers as coming from the ELB's interface. That is, the master associates the ELB's interface address with the peer instead of the peer's actual IP address. I have the ELB listener set to HTTPS, so the ELB sets the X-Forwarded-For header, which per the AWS docs should contain the peer's actual address.
The problem occurs when two indexer peers hit the same ELB node: the master sees both using the same IP address and rejects the second peer with a message like:
Search peer A.B.C.D has the following message: Failed to add peer 'guid=46410D7B-5283-4A25-9BC6-993D67E1E21F server name=A.B.C.D ip=W.X.Y.Z:8089' to the master. Error=Peer with hostport=W.X.Y.Z:8089 is already registered and UP.
A.B.C.D is the peer's address.
W.X.Y.Z is the ELB interface.
My understanding is that peers talk to the master over HTTPS, but I'm guessing the master doesn't support the X-Forwarded-For header.
Is there any way to set 'hostport' or is that derived from the source address field of the TCP session?
Has anyone gotten this sort of setup to work?
We can use register_replication_address to get this to work:
register_replication_address = <IP address, or fully qualified machine/domain name>
* Only valid for mode=slave
* This is the address on which a slave will be available for accepting
  replication data. This is useful in the cases where a slave host machine
  has multiple interfaces and only one of them can be reached by another
  splunkd instance
This setting makes the Cluster Master communicate with the indexer through a specific hostname or IP address. Without it, the Cluster Master guesses what the IP is (and in this case guesses the ELB's IP).
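For anyone landing here, a minimal sketch of what that could look like in server.conf on each peer. The replication port, the master URI and the secret are placeholders I made up for illustration, and A.B.C.D stands for the peer's own address as in the error message in the question. I have not verified that this changes the hostport the master registers, only that these settings exist for mode=slave.

```ini
# server.conf on each peer (indexer). Placeholder values, adjust to your setup.
[replication_port://9887]

[clustering]
mode = slave
# The master as the peer reaches it, here through the ELB (assumed name)
master_uri = https://<elb-dns-name>:8089
pass4SymmKey = <cluster secret>
# The peer's own address, so the master does not have to guess it
register_replication_address = A.B.C.D
```

A restart of splunkd on the peer is needed for the change to take effect.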
I know this is an old comment, but I have the same situation: indexer on EC2, CM with an ELB in front. I'm trying to use register_replication_address, but I get "Invalid Hostname" when trying to add the node to the cluster. It's weird, because both boxes can talk to each other.
Have you validated name resolution between the indexers and the CM? The CM needs to be able to resolve the name/IP of the indexers, and the indexers need to be able to resolve and communicate with the IP/name of the ELB.
This is where the register_*_address settings come into play. Make sure these are mapped correctly. dig/nslookup is your friend here.
I have, actually.
From the CM I can resolve the IP of the indexer, and vice versa.
Note: I did notice the "null" problem, but it's not really a problem. It still resolves, and from what I looked up it can happen for a lot of reasons.
CM:
`
nslookup w.x.y.z
nslookup: can't resolve '(null)'
Name: w.x.y.z
Address 1: w.x.y.z ip-w-x-y-z.ca-central-1.compute.internal
Indexer:
nslookup a.b.c.d
a.b.c.d.in-addr.arpa name = ip-a-b-c-d.ca-central-1.compute.internal
`
I have resorted to pointing directly at the IP for now, because at this point I am just testing register_replication_address. I have run tcpdump on traffic from the CM to the indexer, and I am not even seeing the CM reach out to the indexer on that IP. So confused.
I have a question here if you want more details: https://answers.splunk.com/answers/818038/cluster-master-register-replication-address-invali.html
So using just nslookup I am able to "resolve" the IP of the indexer from the CM, and vice versa:
CM:
`
nslookup: can't resolve '(null)'
Name: w.x.y.z
Address 1: w.x.y.z ip-w-x-y-z.ca-central-1.compute.internal
Indexer:
nslookup x.x.x.x
x.x.x.x.in-addr.arpa name = ip-x-x-x-x.ca-central-1.compute.internal.
Authoritative answers can be found from:
`
I've actually resorted to using the IP for the CM to test register_replication_address, which does not seem to work. I posted more details here if you can take a look: https://answers.splunk.com/answers/818038/cluster-master-register-replication-address-invali.html
At this point I'm just trying to get replication_address working, then move forward.
I got this working with a Network Load Balancer as opposed to a classic ELB. Here's a CloudFormation template snippet I used to set it up:
CMRoute53:
  Type: AWS::Route53::RecordSet
  Properties:
    AliasTarget:
      DNSName: !GetAtt [CMLoadBalancer, DNSName]
      HostedZoneId: <NLB hosted zone id from https://docs.aws.amazon.com/general/latest/gr/rande.html#elb_region>
    Comment: DNS Record for the Cluster Master (for indexer discovery)
    HostedZoneName: 'location.net.'
    Name: 'master.location.net'
    Type: A
CMLoadBalancer:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    Type: network
    Scheme: internal
    ...
CMTargetGroup8089:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    HealthCheckPort: 8089
    HealthCheckProtocol: TCP
    Port: 8089
    Protocol: TCP
    VpcId: <vpcid>
CMListener8089:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    DefaultActions:
      - TargetGroupArn: !Ref CMTargetGroup8089
        Type: forward
    LoadBalancerArn: !Ref CMLoadBalancer
    Port: 8089
    Protocol: TCP
CMAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    ...
    TargetGroupARNs:
      - !Ref CMTargetGroup8089
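With that in place, the indexers would point at the Route 53 name rather than at the master instance itself. A rough sketch of the peer side, assuming the record name and port from the template above; the secret is a placeholder and this stanza is not taken from my actual config. My guess as to why this works where the classic ELB did not is that a TCP listener on an NLB can pass the peer's source address through to the master, but I have not confirmed that this is the reason.

```ini
# server.conf on each indexer. A sketch, not copied from a running system.
[clustering]
mode = slave
# DNS alias for the NLB in front of the Cluster Master, plus the listener port
master_uri = https://master.location.net:8089
pass4SymmKey = <cluster secret>
```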