Deployment Architecture

What are the expected results if a Splunk deployment server goes down for an extended period of time?

RJ_Grayson
Path Finder

I haven't been able to find a definitive answer anywhere in the Splunk documentation, and I'm curious what the official consensus on this question is.

Scenario: A single Splunk deployment server is deploying a variety of configuration bundles to 100+ universal forwarders. These bundles include configs such as inputs.conf, outputs.conf, server.conf, etc. Suddenly the deployment server suffers a physical hardware failure. The server will be down for 24+ hours (hypothetical for the scenario). What will the deployment clients do?

Will they continue to attempt contact with the deployment server indefinitely? I've heard some people say that after a certain period of time the forwarders will dump their apps if they can't re-establish a connection with the deployment server. Is this the expected behavior? Or will the forwarders keep their apps and configuration bundles indefinitely, until the deployment server comes back up and they can re-establish connectivity or until they are pointed to another deployment server?

Has anyone experienced a prolonged deployment server outage? What did the deployment clients do? Is there any official documentation regarding this situation that others have seen that I may have missed?

esix_splunk
Splunk Employee

If your DS is down for an extended period of time, the clients will keep trying to contact the DS at the configured phone-home interval (phoneHomeIntervalInSecs in deploymentclient.conf). They will not dump apps or stop working.

Once the DS comes back up, nothing will happen as long as the checksums are the same. However, if you build a new DS, even with the same apps, the checksums used to match the apps will be different (unless you restore the original DS). This will trigger the clients to re-download all the apps they are associated with in the DS's serverclasses.
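
To make the behavior described above concrete, here is a rough toy model in Python. It is not Splunk code and does not reflect how the forwarder is actually implemented; the class name, the method name, and the checksum strings are all made up for illustration. It only captures the three cases: DS unreachable (keep everything and retry later), DS back with the same checksums (nothing happens), and DS rebuilt with different checksums (re-download).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeploymentClient:
    """Hypothetical stand-in for a forwarder acting as a deployment client."""
    # app name -> checksum of the bundle currently installed
    installed: Dict[str, str] = field(default_factory=dict)

    def phone_home(self, server_apps: Optional[Dict[str, str]]) -> List[str]:
        """Simulate one phone-home attempt.

        server_apps is None when the DS cannot be reached; otherwise it maps
        app name -> checksum as advertised by the DS.
        Returns the list of apps that were (re-)downloaded.
        """
        if server_apps is None:
            # DS is down: keep all installed apps and try again next interval.
            return []
        downloaded = []
        for app, checksum in server_apps.items():
            if self.installed.get(app) != checksum:
                # Checksum mismatch: fetch the bundle again.
                self.installed[app] = checksum
                downloaded.append(app)
        return downloaded


client = DeploymentClient({"inputs_app": "aaa111", "outputs_app": "bbb222"})

# DS outage: nothing is removed, nothing is downloaded.
print(client.phone_home(None))

# Original DS comes back with the same checksums: no change.
print(client.phone_home({"inputs_app": "aaa111", "outputs_app": "bbb222"}))

# Rebuilt DS with the same apps but new checksums: everything is re-downloaded.
print(client.phone_home({"inputs_app": "ccc333", "outputs_app": "ddd444"}))
```

In this sketch the outage case leaves the installed dictionary untouched, which matches the point above that the clients do not dump their apps while the DS is unreachable.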

RJ_Grayson
Path Finder

Thanks esix, I did some additional testing and can confirm all of the above.
