Deployment Server Connection Errors

chris94089
Path Finder

Greetings,

I'm hosting a Splunk Deployment Server in a Kubernetes environment. With one replica, connections are smooth. With two or more, I get this error:

PubSubSvr - sender=connection_<my_client> channel=tenantService/handshake Message not dispatched (connection invalid)

The server receives every phone home, but it doesn't do anything beyond that.

What could cause this?

isoutamo
SplunkTrust
Hi,
If I understood correctly, you are trying to use several deployment servers? If so, check the crossServerChecksum setting described at https://docs.splunk.com/Documentation/Splunk/7.3.3/Admin/Serverclassconf; it must be set to true.
r. Ismo

chris94089
Path Finder

This response would be appropriate when the deployment server keeps re-deploying its apps to clients over and over again. I think crossServerChecksum tells Splunk to ignore the timestamp mismatch, but what exactly it does isn't really documented.

That is not the case here.

The behavior I am asking about is Splunk treating certain connections as invalid when it is placed behind a load balancer, even though curl works, I can access Splunk Web, and so on.

Are there other settings one should configure when running multiple deployment servers?


ips_mandar
Builder

Hi @chris94089, were you able to resolve these errors? Could you provide some details? I am getting the same errors on my deployment servers.

Thanks,


chris94089
Path Finder

The only workaround I was able to use involved spinning up a custom nginx load balancer instance and using the hash addr algorithm.

I tried searching for the link on how to use hash_addr, but I’m not finding it. It may have been updated. I’m no longer on the project, so I can’t confirm whether my specific workaround is still working.

The reason for hashing is that the same DS and DC pair needs at least two (or three) back-to-back phone homes to successfully enroll a new DC. After that, a DC can get its updates from any of the other DSs placed behind the load balancer.

The behavior goes like this: a new deployment client (DC) phones home, passes through the load balancer, and hits deployment server 1 (DS1). Then it phones home again, but if the load balancer routes it to deployment server 2 (DS2), you’ll get the invalid error, and the enrollment just loops again. This is what happens with round-robin algorithms.

Phone homes don’t use sessions, so cookies won’t work either. My approach was to have the load balancer “remember” which DC was phoning home to which DS some other way. This seems necessary only during the initial enrollment attempts. Once the DC has its apps, it doesn’t care which DS it connects to for app updates.
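
To make the difference concrete, here is a small toy simulation in Python. It is only a sketch of the behavior I observed, not Splunk's actual handshake protocol or any nginx code. The names (enroll, round_robin, hash_by_address), the two-phone-home threshold, and the sample client address are all made up for illustration. The model assumes enrollment completes only when the same DS sees the required number of consecutive phone homes from a client.

```python
import hashlib
from itertools import cycle

# Assumption taken from the thread: enrollment needs "at least two (or three)"
# back-to-back phone homes to the same deployment server.
REQUIRED_CONSECUTIVE = 2


def round_robin(servers):
    """Return a picker that hands out servers in rotation, ignoring the client."""
    ring = cycle(servers)
    return lambda client_addr: next(ring)


def hash_by_address(servers):
    """Return a picker that always maps the same client address to the same server."""
    def pick(client_addr):
        digest = hashlib.sha256(client_addr.encode()).digest()
        return servers[int.from_bytes(digest[:8], "big") % len(servers)]
    return pick


def enroll(client_addr, pick_server, max_phone_homes=10):
    """Return how many phone homes enrollment took, or None if it never completed."""
    last_server = None
    streak = 0
    for attempt in range(1, max_phone_homes + 1):
        server = pick_server(client_addr)
        # Landing on a different server loses the handshake state (toy model
        # of the "connection invalid" loop), so the streak starts over.
        streak = streak + 1 if server == last_server else 1
        last_server = server
        if streak >= REQUIRED_CONSECUTIVE:
            return attempt
    return None


if __name__ == "__main__":
    servers = ["ds1", "ds2"]
    client = "10.0.0.5"  # hypothetical deployment client address
    print("round robin:", enroll(client, round_robin(servers)))
    print("hash by address:", enroll(client, hash_by_address(servers)))
```

With two servers, the round-robin picker alternates on every phone home, so the streak never reaches two and the client never enrolls in this model. The address-hash picker sends the client to the same DS each time, so it enrolls on the second phone home. One caveat I would expect in Kubernetes: if the load balancer only sees the address of an upstream proxy rather than the real client, hashing on that address will not spread clients as intended.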

