
How should I handle Splunk Cluster Migration to New Hardware?

rtongue
Observer

Greetings, everyone.

I apologize if this question has been answered before, but I need a deeper understanding of how to proceed here. We currently have two Splunk Enterprise indexer clusters. One is our prod infrastructure, spanning two geo-separated datacenters, with 8 nodes total, 4 in each site. We also have nonprod, which is a very similar setup but on a single physical site, with 4 nodes making up the cluster.

We have recently been asked to assist in migrating these clusters to brand-new physical servers and have questions about the best way to proceed. First, the hot tier lives on local SSD storage arrays in our current physical hosts, and our "colddb" sits on a chunk of SAN storage connected over Fibre Channel. This is where the wrinkle is. We are not getting new SAN storage for "colddb", so we cannot stand up each new server as a 9th node, let it replicate, then remove the node it replaces to get back to 8, repeating for all nodes. Instead, we will have to remove the SAN allocation from each old node and attach it to its new node, which makes that type of migration impossible.

My initial assumption is that we will instead need to decommission a node and replace it with a new one, one at a time, as if the node had failed. Am I correct in this assumption?

Is there a better way to handle this, or am I stuck with the current situation? Thanks for your time.


isoutamo
SplunkTrust

rtongue
Observer

Thanks for the link; however, I don't feel it addresses my main concern, which is that I don't have new storage to use in this process and would have to decommission a node before installing its replacement. That seems dangerous.


isoutamo
SplunkTrust

If you cannot borrow additional disk space for colddb for the duration of the migration, then you have two options:

  1. Lose/freeze whatever data no longer fits during the migration, as described in the post linked above.
  2. Replace the nodes one by one, detaching the SAN disks from each old node and attaching them to its replacement.

The first option is much safer and easier. The second can, in the worst case, lead to situations where you lose events.
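For option 1, the freezing is normally driven by retention settings in indexes.conf on the peers. A minimal sketch of the relevant knobs, with a hypothetical index name (my_index) and illustrative paths and values; the right limits depend on how much colddb space you actually have to spare during the migration:

  # indexes.conf -- hypothetical index and values, for illustration only
  [my_index]
  homePath   = $SPLUNK_DB/my_index/db
  coldPath   = /san/colddb/my_index/colddb
  thawedPath = /san/colddb/my_index/thaweddb

  # Tighten retention so the oldest cold buckets roll to frozen
  # (deleted by default) before the migration begins.
  frozenTimePeriodInSecs = 7776000     # e.g. 90 days
  maxTotalDataSizeMB     = 250000      # oldest buckets freeze once this cap is hit

  # Optional: archive frozen buckets instead of deleting them.
  # coldToFrozenDir = /archive/my_index/frozen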

For option 2, you set up the new node without the SAN storage and replicate the SSD storage, Splunk software, and configuration from the old node. If you are using rpm or deb packages, install the package first, then use rsync to replicate the rest from the old node. Make sure you carry over splunk.secret, the instance GUID, and all other configuration from the old node. Then shut down the old instance, do a final rsync from it with the delete option so the copy matches exactly, detach the SAN disks, and move them to the new node. Ensure that the volume group and file system definitions are correct and the permissions are right. After that, bring the new node up in place of the old one. It is probably also worth increasing some of the cluster timeout settings to avoid unneeded bucket replication among the other peers in the cluster.
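A rough sketch of that sequence, assuming $SPLUNK_HOME is /opt/splunk, the new host is reachable as newidx01, and the colddb volume group is vg_colddb (the hostname and volume group are placeholders; adapt paths and LVM details to your environment):

  # On the cluster manager: suppress fixup activity while a peer is down.
  /opt/splunk/bin/splunk enable maintenance-mode

  # On the NEW node: install the same Splunk version from rpm/deb first,
  # but do not start it yet.

  # On the OLD node, while Splunk is still running: pre-seed the new host
  # so the big copy happens before the cutover window.
  rsync -aH /opt/splunk/ newidx01:/opt/splunk/

  # Cutover: stop the old instance, then a final sync with --delete so the
  # new node matches exactly. This carries etc/auth/splunk.secret and
  # etc/instance.cfg (which holds the peer GUID) along with everything else.
  /opt/splunk/bin/splunk stop
  rsync -aH --delete /opt/splunk/ newidx01:/opt/splunk/

  # Detach the SAN LUNs from the old host, present them to the new one,
  # then on the NEW node activate and mount the colddb volume:
  vgscan
  vgchange -ay vg_colddb
  mount /dev/vg_colddb/colddb /san/colddb
  chown -R splunk:splunk /opt/splunk /san/colddb

  # Start Splunk on the new node; with the same GUID it should rejoin the
  # cluster as the same peer, so no mass re-replication is triggered.
  /opt/splunk/bin/splunk start

  # Back on the cluster manager, once the peer reports Up:
  /opt/splunk/bin/splunk disable maintenance-mode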
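For the timeout tuning, the usual knobs live under the [clustering] stanza in server.conf on the cluster manager; the values below are purely illustrative, not recommendations:

  # server.conf on the cluster manager -- example values only
  [clustering]
  # How long the manager waits for a restarting peer before starting fixups.
  restart_timeout = 600
  # How long a peer can go silent before it is considered down.
  heartbeat_timeout = 120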

As you can see, this is a somewhat complicated procedure, but it's doable. Fortunately, you have a test environment where you can practice it and write step-by-step instructions for the production migration.
