Deployment Architecture

Deployment Server Connection Errors

chris94089
Path Finder

Greetings,

I'm hosting a Splunk Deployment Server in a Kubernetes environment. With one replica, connections are smooth. But when I use two or more, I get this error:

PubSubSvr - sender=connection_<my_client> channel=tenantService/handshake Message not dispatched (connection invalid)

The server receives every phone home, but it doesn't do anything else.

What are some of the causes for this?


soutamo
SplunkTrust
Hi
If I understand correctly, you are trying to run several deployment servers (DS)? If so, check the crossServerChecksum setting in serverclass.conf (https://docs.splunk.com/Documentation/Splunk/7.3.3/Admin/Serverclassconf); it must be set to true.
r. Ismo

chris94089
Path Finder

This response would be appropriate for situations where the deployment server keeps re-deploying its apps to clients over and over again. crossServerChecksum tells Splunk to ignore the timestamp mismatch (I think; it's not really documented what exactly crossServerChecksum does).

That is not the case here.

The behavior I am asking about is Splunk treating certain connections as invalid when it is placed behind a load balancer, even though curl works, I can access Splunk Web, etc.

Are there other settings one should configure when running multiple deployment servers?


ips_mandar
Builder

Hi @chris94089, were you able to resolve these errors? Can you provide some details, since I am also getting the same errors on my deployment servers?

Thanks, 


chris94089
Path Finder

The only workaround I was able to use involved spinning up a custom nginx load balancer instance and using the hash addr algorithm.

I tried searching for the link on how to use hash_addr but I’m not finding it. It may have been updated. I’m no longer on the project, so I can’t check whether my specific workaround is still working.

The reason for using hashing is that the same DS and DC pair needs at least two (or three) back-to-back phone homes to successfully enroll a new DC. After that, a DC can get its updates from any of the other DSs behind the load balancer.


The behavior goes like this: a new deployment client (DC) phones home, passes through the load balancer, and hits deployment server 1 (DS1). Then it phones home again, but if the load balancer routes it to deployment server 2 (DS2), you’ll get the connection invalid error. Then the enrollment just loops again. This is what happens with round-robin algorithms.
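To make the loop concrete, here is a toy Python model of it. This is only a sketch of the behavior described above, not Splunk's actual handshake logic: the `enroll` function, the threshold of two consecutive phone homes, and the single-client round robin are all simplifying assumptions.

```python
from itertools import cycle


def enroll(route, phone_homes=6, needed=2):
    """Return True once `needed` consecutive phone homes reach the same DS.

    `route` is a hypothetical callable that returns the DS chosen by the
    load balancer for the next phone home.
    """
    streak, last = 0, None
    for _ in range(phone_homes):
        server = route()
        # Extend the streak only if this phone home hit the same DS as the last one.
        streak = streak + 1 if server == last else 1
        last = server
        if streak >= needed:
            return True
    return False


servers = ["DS1", "DS2"]

# Round robin: one client alternates DS1, DS2, DS1, ... and never builds a streak.
rr = cycle(servers)
round_robin_ok = enroll(lambda: next(rr))

# Pinned routing: every phone home from this client lands on the same DS.
pinned_ok = enroll(lambda: "DS1")

print(round_robin_ok, pinned_ok)
```

In this model the round-robin client never enrolls, while the pinned client enrolls on its second phone home.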

Phone homes don’t use sessions, so cookies won’t work either. So my approach was to have the load balancer “remember” which DC was phoning home to which DS some other way. This seems to be necessary only during the initial enrollment attempts. After the DC has its apps, it doesn’t care which DS it connects to for app updates.
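One way a load balancer can “remember” without sessions is to derive the backend from a hash of the client address, so the same DC always lands on the same DS. The Python sketch below illustrates the idea only. `pick_server`, the SHA-256 choice, and the sample address are assumptions for illustration, not the nginx implementation. In nginx itself, the upstream directives `ip_hash` and `hash $remote_addr` provide this kind of routing, and one of them may be what I was using.

```python
import hashlib


def pick_server(client_addr, servers):
    """Map a client address to one backend deterministically (hypothetical helper)."""
    digest = hashlib.sha256(client_addr.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer index into the server list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["DS1", "DS2", "DS3"]

# Repeated phone homes from the same address always resolve to the same DS.
choices = {pick_server("10.0.0.17", servers) for _ in range(5)}
print(choices)
```

Note that clients sharing one source address (for example, behind NAT) all hash to the same DS, and changing the server list can remap clients, which would restart enrollment for any DC that is mid-handshake.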

