Search head pooling with DNS round robin - getting logged out almost immediately after logging in?

msarro
Builder

Greetings everyone. We have a moderately sized distributed deployment with 3 pooled search heads, all 3 of which have been added to an FQDN that round-robins through their IPs. We are in the process of upgrading from Splunk v4 to Splunk v5. On v5, when logging in through the DNS round-robin (DNSRR) FQDN, the user is logged back out almost immediately after successfully logging in. This was not the case in v4.

Has anyone else encountered this sort of issue? I'm not sure how to even approach it.

gkanapathy
Splunk Employee

You can't load-balance Splunk with DNS round robin. You must use a mechanism that preserves session affinity, or you will be forced to log in again whenever your browser is directed to a new server. I am surprised that it worked in the older version; my speculation is that your DNS configuration was slightly different then, e.g., the TTL for the IPs may have been set much longer.
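
To illustrate why the lack of session affinity logs you out, here is a toy model in Python. It assumes each server keeps its sessions only in its own memory; `ToyServer` and the names `sh1`, `sh2`, `sh3` are made up for the example, and this is not a description of how Splunk actually stores sessions.

```python
import itertools
import uuid


class ToyServer:
    """Hypothetical web server that keeps sessions only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()

    def login(self):
        # Issue a session token that only this server knows about.
        token = uuid.uuid4().hex
        self.sessions.add(token)
        return token

    def is_logged_in(self, token):
        return token in self.sessions


servers = [ToyServer("sh1"), ToyServer("sh2"), ToyServer("sh3")]

# Round robin: every request goes to the next server in turn.
round_robin = itertools.cycle(servers)
token = next(round_robin).login()  # login is handled by sh1
rr_results = [next(round_robin).is_logged_in(token) for _ in range(3)]

# Session affinity: the client stays pinned to the server it logged in on.
pinned = servers[0]
sticky_token = pinned.login()
sticky_results = [pinned.is_logged_in(sticky_token) for _ in range(3)]

print("round robin:", rr_results)      # [False, False, True]
print("sticky:     ", sticky_results)  # [True, True, True]
```

In the round-robin case the very next request lands on a server that has never seen the token, which is the "logged out right after logging in" symptom.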

msarro
Builder

Excellent, I will give GSLB a shot. Luckily, I may have located some cheap load balancers as well, so I can tackle it in our next sprint. Thanks so much for your help!

gkanapathy
Splunk Employee

GSLB based on DNS usually sets the TTL very low, specifically so that the DNS server can change the target host without the old IP getting stuck too long in a client cache. On the other hand, it should not round-robin a single client. To work correctly, it needs to keep returning the same address to any given client until a failure occurs and it has to tell the client to use a new target IP.
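
The "same address for a given client until a failure" behavior can be sketched roughly as follows. This is only an illustration using rendezvous hashing, not how any particular GSLB product does it; `pick_target` is a hypothetical helper and the IPs are placeholder documentation addresses.

```python
import hashlib


def pick_target(client_ip, targets, healthy):
    """Return the same target for a given client while that target is healthy."""
    alive = [t for t in targets if t in healthy]
    if not alive:
        raise RuntimeError("no healthy targets")

    # Rank targets by a hash of (client, target); the highest-ranked healthy
    # target wins. A failure only moves the clients of the failed target.
    def score(target):
        return hashlib.sha256(f"{client_ip}|{target}".encode()).hexdigest()

    return max(alive, key=score)


targets = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]  # placeholder search head IPs
client = "198.51.100.7"                               # placeholder client IP

first = pick_target(client, targets, set(targets))
again = pick_target(client, targets, set(targets))
after_failure = pick_target(client, targets, set(targets) - {first})

print(first == again)          # True: the client keeps getting the same answer
print(after_failure != first)  # True: it moves only once its target is down
```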

gkanapathy
Splunk Employee

The TTL specifies how long the client should cache a DNS entry. So if your old one was set to 86400 seconds while your new one is 30 seconds, you'll have problems.
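
As a rough illustration of the difference between those two TTLs, here is a small simulation. It assumes an idealized client cache that honors the TTL exactly and a round-robin upstream that returns the next IP on every lookup; real resolvers and browsers may behave differently. `ToyResolverCache` and the IPs are made up for the example.

```python
import itertools


class ToyResolverCache:
    """Hypothetical client-side DNS cache that honors the record TTL exactly."""

    def __init__(self, upstream, ttl_seconds):
        self.upstream = upstream  # callable returning the next round-robin IP
        self.ttl = ttl_seconds
        self.cached_ip = None
        self.expires_at = 0

    def resolve(self, now):
        # Only ask upstream again once the cached entry has expired.
        if self.cached_ip is None or now >= self.expires_at:
            self.cached_ip = self.upstream()
            self.expires_at = now + self.ttl
        return self.cached_ip


def distinct_ips_seen(ttl_seconds, duration=600, step=10):
    """Count the IPs a client sees when it resolves every `step` seconds."""
    rotation = itertools.cycle(["192.0.2.11", "192.0.2.12", "192.0.2.13"])
    cache = ToyResolverCache(lambda: next(rotation), ttl_seconds)
    return len({cache.resolve(t) for t in range(0, duration, step)})


print(distinct_ips_seen(86400))  # 1: the client sticks to one server all session
print(distinct_ips_seen(30))     # 3: the client hops servers within minutes
```

With the long TTL the client effectively gets accidental session affinity, which may be why the round-robin setup appeared to work before.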

msarro
Builder

Really? I'd have thought that the local DNS cache would have stored the IP of the last server used. Darn - sounds like I need to start begging for a load balancer. Is it known whether or not GSLB will work? Anything to avoid exhausting a tight budget, lol.
