Blue-green deployment for a Splunk cluster to achieve near-zero downtime during upgrades

shaileshr1
Engager

I have a Splunk cluster consisting of a master, 2 search heads and 2 indexers. The indexers receive logs from forwarders as well as through the AWS plugin. How do I achieve zero (or near-zero) downtime during an upgrade of this cluster and ensure no data loss?

nickhills
Ultra Champion

You can't avoid restarting standalone search heads, so you will need to plan for some disruption there (it is unavoidable without a search head cluster). It's quick, though: 5 minutes or so. Do it during a quiet period, and make sure you upgrade the SHs first!

You don't mention which version you are running, but if it's later than 7.1, your indexer cluster can be upgraded with minimal disruption to search and indexing operations if you follow the guidelines here:
https://docs.splunk.com/Documentation/Splunk/8.0.2/Indexer/Searchablerollingupgrade
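
As a rough pre-flight check for the version condition above, here is a minimal Python sketch. The function names are hypothetical, and treating 7.1 itself as eligible is an assumption; confirm the exact minimum against the linked documentation before relying on it.

```python
from typing import Tuple

# Assumption: 7.1 and anything newer qualifies for a searchable rolling upgrade.
MIN_SEARCHABLE_ROLLING = (7, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '8.0.2' into (8, 0, 2)."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_searchable_rolling_upgrade(version: str) -> bool:
    """Return True if the given version meets the assumed 7.1 minimum."""
    return parse_version(version) >= MIN_SEARCHABLE_ROLLING


if __name__ == "__main__":
    for v in ("7.0.9", "7.1.0", "8.0.2"):
        print(v, supports_searchable_rolling_upgrade(v))
```

Every node in the cluster would need to pass this check, not just the indexers, so run it against the version reported by each instance.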

You also don't mention what your data sources are. If they are file monitors or Windows events, there is minimal risk of data loss during the upgrade: pending logs simply wait until the indexers are available again, and the forwarders then send any data that was paused.

If you are using syslog, it depends on whether you are sending data directly to Splunk or via a syslog server + UF. The former will likely cause gaps in your logs; the latter should not.
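
The data-source guidance above can be summarised as a small lookup, sketched here in Python. The source-type keys and the `upgrade_risk` helper are hypothetical names chosen for illustration, and the risk notes only restate this answer; they are not an official Splunk classification.

```python
# Hypothetical source-type labels mapped to the risk described in this answer.
RISK_BY_SOURCE = {
    "file_monitor": "minimal: pending data waits until the indexers are back",
    "windows_events": "minimal: pending data waits until the indexers are back",
    "syslog_direct": "gaps likely: data sent straight to Splunk during a restart can be lost",
    "syslog_via_server_and_uf": "should be safe: the syslog server + UF hold the data",
}


def upgrade_risk(source_type: str) -> str:
    """Return the data-loss risk note for a source type, or flag it for review."""
    return RISK_BY_SOURCE.get(source_type, "unknown: review this input manually")


if __name__ == "__main__":
    for source in ("file_monitor", "syslog_direct", "aws_plugin"):
        print(f"{source}: {upgrade_risk(source)}")
```

Inputs that the answer does not cover, such as the AWS plugin mentioned in the question, fall through to the "unknown" case on purpose so they get checked by hand.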
