Solved: Strategies for moving indexes to new datacenter

kutzi · ‎06-10-2020

Are there any recommended strategies for moving the index data to a new data center?

We're planning to build up a new Splunk cluster (indexers, searchheads, master, license-master) in a new data center. We want to migrate the existing index data.
The most straightforward solution would probably be:

1. Shut down the old cluster
2. Copy the indexer data
3. Start the new cluster

However, we have roughly 4-5 TB of index data, so that would mean a very long downtime.
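
To put a rough number on that downtime, here is a back-of-the-envelope sketch. The link speeds are assumptions (nothing in this thread states the bandwidth between the two data centers), and protocol overhead and disk throughput are ignored, so a real copy would take longer.

```shell
# Rough transfer time for SIZE_TB terabytes over a LINK_GBPS link.
# Assumes decimal units (1 TB = 8000 gigabits) and a fully used link.
transfer_time() {
  local size_tb="$1" link_gbps="$2"
  local secs=$(( size_tb * 8000 / link_gbps ))
  printf '%dh %02dm\n' $(( secs / 3600 )) $(( (secs % 3600) / 60 ))
}

transfer_time 5 1    # 5 TB over an assumed 1 Gbit/s link
transfer_time 5 10   # same data over an assumed 10 Gbit/s link
```

At an assumed 1 Gbit/s this comes out at about 11 hours for 5 TB, and at 10 Gbit/s at a bit over one hour.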

Are there better solutions?
I was thinking about extending the existing indexer cluster to the new DC and increasing the replication factor so that at least one of the indexers in the new DC is guaranteed to have the data.
Then, when everything is synced, shut down the old indexers, disconnect the new indexers from the old master, and connect them to the new master. Done.

Has anyone had experience with such a scenario? Or any other proposed solutions?

jcrabb_splunk · ‎06-10-2020

Assuming you have the bandwidth & connectivity, you could simply add the new peers to the existing index cluster. Once SF/RF have been met, offline the "old" peers one at a time. Then you have the data and configuration in the new location.
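
For reference, joining a new indexer to the existing cluster on 8.0.x could look roughly like the sketch below. It only prints the commands so they can be reviewed first; the master URI, replication port, and secret are placeholders, not values from this thread.

```shell
# Print the commands that would turn a fresh indexer into a peer of the
# existing cluster. All three arguments are placeholders for your own values.
join_peer_cmds() {
  local master_uri="$1" repl_port="$2" secret="$3"
  echo "splunk edit cluster-config -mode slave -master_uri ${master_uri} -replication_port ${repl_port} -secret ${secret}"
  echo "splunk restart"
}

join_peer_cmds "https://old-master.example.com:8089" 9887 "your_key"
```

The printed commands would be run on each new indexer. Wait until the master reports SF/RF as met before taking any old peer offline.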

Offline Reference: Take a peer down permanently

Then you can replace the master node through this documented process: Replace Master Node. Finally, you would swap the License Master through this documented process: Swap License Master.

Jacob
Sr. Technical Support Engineer

kutzi · ‎07-06-2020

Having tried that successfully with 2 test clusters, I'm now trying it on our actual cluster, and the replication seems to be stuck, or rather probably hasn't really started yet.
I see that the number of buckets on the new indexers has risen to about half that of the old indexers, but they don't seem to contain data yet, as the disks are still pretty much empty.

There are a couple of WARNs in the log files, but nothing that lets me really pinpoint the issue. The most suspicious ones, IMO:

WARN  AdminHandler:AuthenticationHandler - Denied session token for user: splunk-system-user

(based on other forum posts, this seems to point to a search head -> indexer issue and not to the index cluster)

or

 CMMaster - event=handleReplicationError bid=main~2682~15487F52-1A61-440B-821A-BDA4AA62E608 tgt=DE69C0E9-DE8B-4863-B37E-734D138BDCCA peer_name=splunk-ind-02.live.eu-central-1.zeal.zone msg='target doesn't have bucket now. ignoring'

Also strange that the new indexers have an empty value in the 'Indexer Cluster' column in the Monitoring Console/Instances view.

kutzi · ‎07-06-2020

Update: it seems the issue might have been that the new indexers had a newer version (8.0.4.1) than the old master and indexers (8.0.1).
After using 8.0.1 for the new indexers too, it looks better now and the replication is apparently making progress.
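
A small sketch of a pre-flight check for this kind of mismatch. It assumes you have already collected a "host version" list by some means (for example from `splunk version` on each node); the host names below are made up.

```shell
# Report every host whose Splunk version differs from the expected one.
# Reads "host version" pairs from stdin, one per line.
version_mismatches() {
  local want="$1" host ver
  while read -r host ver; do
    if [ -n "$host" ] && [ "$ver" != "$want" ]; then
      echo "${host} runs ${ver}, expected ${want}"
    fi
  done
}

printf '%s\n' "old-idx-01 8.0.1" "new-idx-01 8.0.4.1" | version_mismatches "8.0.1"
```

With the sample input, only the hypothetical new-idx-01 is reported, which mirrors the 8.0.4.1 vs. 8.0.1 situation described above.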

codebuilder · ‎06-10-2020

I've performed this very task myself, and it's fairly tedious, but does work.

The easiest way is to ensure that you have exactly the same number of indexers in the new DC as in the old one, and that they have enough storage presented to hold your data. High-level steps below:

1. Stop ALL forwarders.

2. On the new index cluster, deploy indexes.conf via the master. This will create the filesystems required.

3. Shut down both old and new index clusters.

4. Use rsync to transfer data from old indexer to new, meaning from old_indexer_01 to new_indexer_01

5. Repeat that on each indexer in your cluster (can be run simultaneously).

6. On the new indexers, ensure that the filesystem is owned by your Splunk user (chown -RP splunk:splunk <your_directories>).

7. Once rsync completes on all nodes, bring up the NEW index cluster.

8. Use tstats or similar searches to verify event counts and confirm that all data is searchable.

9. Use DS to re-configure forwarders to point to new master, then start them back up.

10. Verify new events are coming in to new index cluster.
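
Steps 4 to 6 above could be scripted roughly as follows. This is a dry-run sketch that only prints the commands; the number of indexers is hypothetical, and the index path assumes the default location of a /opt/splunk install, which may differ on your systems.

```shell
# Print the copy and ownership commands for one old/new indexer pair.
# INDEX_DIR is an assumption: the default index location of a /opt/splunk install.
INDEX_DIR="${INDEX_DIR:-/opt/splunk/var/lib/splunk}"

plan_copy() {
  local old="$1" new="$2"
  # Step 4: copy the index data from the old indexer to its counterpart.
  echo "ssh ${old} rsync -a ${INDEX_DIR}/ ${new}:${INDEX_DIR}/"
  # Step 6: hand ownership of the copied files to the Splunk user.
  echo "ssh ${new} chown -RP splunk:splunk ${INDEX_DIR}"
}

# Step 5: one pair per indexer, matched by position (01 -> 01, 02 -> 02, ...).
for n in 01 02 03; do
  plan_copy "old_indexer_${n}" "new_indexer_${n}"
done
```

For step 8, a search along the lines of `| tstats count where index=* by index`, run on the old cluster before shutdown and on the new one afterwards over the same time range, gives per-index event counts to compare.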

----
An upvote would be appreciated and Accept Solution if it helps!

kutzi · ‎06-10-2020

But that's an offline migration, right?
That's what I'm trying to avoid.

codebuilder · ‎06-11-2020

Yes, it is an offline method. But it is the fastest and most reliable method that I've found.

I did neglect to include one very important step, however. Before taking down the indexers, be sure to roll all hot buckets to warm (after stopping the forwarders).
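
Rolling hot buckets can be done per index through the roll-hot-buckets endpoint. Below is a sketch that prints the CLI call for each index. The index names are examples only (main shows up in the log line earlier in this thread, the other one is made up), and authentication is left to the CLI prompt.

```shell
# Print one roll-hot-buckets call per index name given as an argument.
roll_hot_cmds() {
  local idx
  for idx in "$@"; do
    echo "splunk _internal call /data/indexes/${idx}/roll-hot-buckets"
  done
}

roll_hot_cmds main my_app_index
```

The printed calls would be run on each indexer once the forwarders are stopped and before Splunk is shut down.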

----
An upvote would be appreciated and Accept Solution if it helps!

jcrabb_splunk · ‎06-11-2020

No, you have the new index cluster peers in the data center which are part of the index cluster. To remove the "old peers" (previously existing index cluster peers), you would have to "offline" them.

From a doc I linked to on my previous post: https://docs.splunk.com/Documentation/Splunk/8.0.4/Indexer/Takeapeeroffline#Take_a_peer_down_permane...

The enforce-counts version of the offline command is intended for use when you want to take a peer offline permanently, but only after the cluster has returned to its complete state.

The index cluster itself would still be up and available. You would simply use

splunk offline --enforce-counts

on each of the existing peers that you want to remove.
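
Spelled out for several peers, that could look like the sketch below. It prints the commands rather than running them, the host names are placeholders, and it assumes Splunk is installed under /opt/splunk on each peer.

```shell
# Print the decommission command for each old peer, in order.
# SPLUNK_BIN is an assumed install path.
SPLUNK_BIN="${SPLUNK_BIN:-/opt/splunk/bin/splunk}"

offline_cmds() {
  local peer
  for peer in "$@"; do
    echo "ssh ${peer} ${SPLUNK_BIN} offline --enforce-counts"
  done
}

offline_cmds old-idx-01 old-idx-02
```

Run them one at a time and let the cluster return to its complete state before moving on to the next peer.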

Jacob
Sr. Technical Support Engineer

kutzi · ‎06-11-2020

That was meant for codebuilder. Unfortunately, the forum doesn't display that well.
Thanks for the clarification, though!

jcrabb_splunk · ‎06-11-2020

Oh gotcha, I'm still getting used to our new community site as well. Thanks!

Jacob
Sr. Technical Support Engineer
