Deployment Architecture

After a Deployment Server outage, clients all reconnect at once. Any way to stop this?

Jason
Motivator

So our Deployment Server was down for some time, longer than the clients' checkin interval, and now that it is back up it is being overwhelmed by hundreds of clients checking in within a few thousandths of a second of each other.

Can anyone think of a way to stop this, without using another tool to sequentially restart splunk on every forwarder?

This is going in as a bug. There's no reason Splunk shouldn't continue to use the checkin interval if it is unable to connect, or at least back off and not retry every 10-15 seconds.

dstaulcu
Builder

handshakeRetryIntervalInSecs =
* Defaults to phoneHomeIntervalInSecs [does not seem to be true]
* This sets the handshake retry frequency.
* Could be used to tune the initial connection rate on a new server

http://docs.splunk.com/Documentation/Splunk/6.0.1/admin/Deploymentclientconf
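
A minimal sketch of how that setting might be applied on each forwarder, assuming it belongs in the [deployment-client] stanza of deploymentclient.conf alongside phoneHomeIntervalInSecs. The values are illustrative placeholders, not tested recommendations:

```
# deploymentclient.conf on the forwarder (sketch; values are assumptions)
[deployment-client]

# Normal checkin interval, in seconds.
phoneHomeIntervalInSecs = 60

# Set the handshake retry interval explicitly rather than relying on
# the documented default, which did not seem to hold in practice.
handshakeRetryIntervalInSecs = 300
```

A longer retry interval should reduce how often each client hammers a recovering server, but it is the same fixed value on every client, so clients that lost contact at the same moment may still retry in step with each other.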
