I haven't been able to find a definitive answer anywhere in the Splunk documentation, and I'm curious what the official consensus on this question is.
Scenario: A single Splunk deployment server is deploying a variety of configuration bundles to 100+ universal forwarders. These bundles include configs such as inputs.conf, outputs.conf, server.conf, etc. Suddenly the deployment server has a physical hardware failure. The server will be down for 24+ hours (hypothetical for the scenario). What will the deployment clients do?
Will they continue to attempt contact with the deployment server indefinitely? I've heard some people state that after a specific period of time the forwarders will dump their apps if they can't re-establish a connection with the deployment server. Is this the expected behavior? Or will the forwarders keep their apps and configuration bundles indefinitely, until the deployment server comes back up and they can re-establish connectivity or until they are pointed to another deployment server?
Has anyone experienced a prolonged deployment server outage? What did the deployment clients do? Is there any official documentation regarding this situation that others have seen that I may have missed?
If your DS is down for an extended period of time, the clients will keep trying to contact the DS at the configured phone-home interval (phoneHomeIntervalInSecs in deploymentclient.conf). They will not dump apps or stop working.
Once the DS comes back up, nothing will happen as long as the app checksums are the same. However, if you build a new DS, even with the same apps, the checksums used to match the apps will be different (unless you restore..), and this will trigger the clients to re-download all the apps they are associated with in the DS's serverclasses.
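For reference, here is a minimal sketch of where the phone-home interval and the DS address live on each client, in deploymentclient.conf. The hostname is a placeholder I made up, and 60 seconds is, as far as I know, the default interval, so treat the values as assumptions rather than recommendations.

```ini
# deploymentclient.conf on the universal forwarder
[deployment-client]
# How often (in seconds) the client phones home to the DS.
# The client keeps retrying at this interval while the DS is unreachable.
phoneHomeIntervalInSecs = 60

[target-broker:deploymentServer]
# Placeholder host; change this to point the client at a different DS.
targetUri = deploymentserver.example.com:8089
```

Changing targetUri is how you would point the clients at another deployment server. Keep in mind that a rebuilt or different DS will likely trigger the app re-download described above.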
Thanks esix, I did some additional testing and can confirm all of the above.