
Merge 2 index clusters

Sherlock_Data
Observer

Hello,
I would like to merge 2 index clusters.

  • Context
    • 2 indexer clusters
    • 1 search head cluster
  • Objectives
    • Add new indexers to cluster B.
    • Move data from cluster A to cluster B.
    • Remove cluster A.
  • Constraint
    • Keep service interruptions to a minimum.


What do you think of this process:

  • Before starting
    • Make sure the clusters run the same Splunk version.
    • Make sure the clusters have the same configuration.
    • Make sure the cluster B volumes can absorb the cluster A indexes.
    • Make sure common indexes have the same configuration. If not, define their final configuration.
  • Add new peer nodes
    • Install new peer nodes.
    • Add the new peer nodes to cluster B.
    • Rebalance data.
    • Add the new peer nodes to outputs.conf and restart (see the sketch after this list).
  • Move data
    • Remove peer nodes A from outputs.conf and restart.
    • Move the index configurations from A to B.
    • Copy peer apps from A to B.
    • Put peer nodes A in manual detention to stop replication from other peer nodes.
    • Add peer nodes A to cluster B.
  • Remove peer nodes A
    • One indexer at a time:
    • Remove peer node A from cluster B.
    • Wait for all fixup tasks to complete so the cluster meets its search and replication factors.
    • Rebalance data.
  • Finally
    • Make sure there is no major issue in the logs.
    • Update diagrams and inventory files (spreadsheets, lookups, etc.).
    • Update dashboards and reports if necessary.
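
A minimal sketch of the outputs.conf change and the CLI commands behind the peer-add, detention, decommission, and rebalance steps above. Hostnames, ports, and the output group name are placeholders, and the exact flag names (-mode slave/-master_uri vs. -mode peer/-manager_uri) depend on your Splunk version:

  # outputs.conf on the forwarders: add the cluster B peers (remove the cluster A peers later)
  [tcpout]
  defaultGroup = cluster_b

  [tcpout:cluster_b]
  server = idxb1.example.com:9997, idxb2.example.com:9997, idxb3.example.com:9997

  # On each new peer node: join cluster B, then restart
  splunk edit cluster-config -mode slave -master_uri https://cm-b.example.com:8089 -replication_port 9887 -secret <pass4SymmKey>
  splunk restart

  # On a cluster A peer, before adding it to cluster B: manual detention
  splunk edit cluster-config -manual_detention on

  # On the cluster B manager: rebalance data across the peers
  splunk rebalance cluster-data -action start

  # On a cluster A peer being removed from cluster B (one at a time)
  splunk offline --enforce-counts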

PickleRick
SplunkTrust

It's a bit complicated and can get messy. Remember that as soon as you add a peer to a cluster, it announces all of its buckets to the CM, and the CM will try to find a way to meet RF/SF across the cluster. Even if a peer is in detention, it still can and will be a _source_ of replication. It won't only be the target of replicated buckets.

So it's more complicated than it seems and you'll most probably end up with extra copies of your buckets which you will have to get rid of manually.
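
If you do end up with excess copies, the cluster manager can clean them up for you; a quick example (run on the CM, the index name is a placeholder):

  # Remove bucket copies exceeding the replication/search factors
  splunk remove excess-buckets
  # or for a single index
  splunk remove excess-buckets <index_name>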

It would be easiest to add new nodes to cluster B, add cluster A as search peers to your SH layer, reconfigure your outputs to send to cluster B only and just wait for the data in A to freeze.
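
If you go that route, the search-side change is roughly the following server.conf on the search heads (pushed via the deployer for a SHC; stanza names, URIs and keys are placeholders), plus pointing outputs.conf at cluster B only as in the earlier sketch:

  # server.conf: search both indexer clusters during the transition
  [clustering]
  mode = searchhead
  master_uri = clustermaster:cluster_a, clustermaster:cluster_b

  [clustermaster:cluster_a]
  master_uri = https://cm-a.example.com:8089
  pass4SymmKey = <key_for_cluster_a>

  [clustermaster:cluster_b]
  master_uri = https://cm-b.example.com:8089
  pass4SymmKey = <key_for_cluster_b>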

But if you can't afford that, as you'll still be installing new nodes anyway, you could try to do something else (this doesn't include rolling hot buckets in cluster A if necessary):

1) Install new nodes for cluster B (don't add them to the cluster yet)

2) Find primary copies of buckets in cluster A

3) Copy over only a primary copy for each bucket from cluster A to the new nodes (don't copy all of them into one indexer - spread them over the new boxes)

4) Put the cluster in maintenance mode

5) Add the new nodes to cluster B

6) Disable maintenance mode, rebalance buckets.

7) Decommission cluster A

That _could_ work but I'd never do that in prod before testing in lab.
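
For steps 2, 4, 5 and 6, a rough sketch; the dbinspect search only enumerates one searchable copy per bucket (deduped on bucketId) as an approximation of "primary", and the host naming pattern is a placeholder:

  | dbinspect index=*
  | search splunk_server="idxa*"
  | dedup bucketId
  | table index bucketId splunk_server path state

  # On the cluster B manager, around adding the new nodes:
  splunk enable maintenance-mode
  # ... add the new nodes and copy the bucket directories over ...
  splunk disable maintenance-mode
  splunk rebalance cluster-data -action start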

 


isoutamo
SplunkTrust

Hi

What is the issue you are trying to solve?

Merging buckets (which is what you are trying to do) between two different indexer clusters is something I really don't recommend, especially if/when you have the same indexes on both clusters. There will be conflicts with bucket numbering etc., which leads to service interruptions.

The best way is to create the missing indexes on cluster B, then update the outputs of cluster A's UFs to point to cluster B. Then just disable external receiving on cluster A. After that, decrease the node count in cluster A to the minimum and wait for the data on it to expire.
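
A minimal sketch of that approach (index and host names are placeholders; the indexes.conf lives under master-apps/_cluster, or manager-apps on newer versions, on the cluster B manager):

  # indexes.conf pushed from the cluster B manager
  [an_index_from_cluster_a]
  homePath   = $SPLUNK_DB/an_index_from_cluster_a/db
  coldPath   = $SPLUNK_DB/an_index_from_cluster_a/colddb
  thawedPath = $SPLUNK_DB/an_index_from_cluster_a/thaweddb
  repFactor  = auto

  # On the cluster B manager: push the new index definitions to the peers
  splunk apply cluster-bundle

  # On each cluster A peer: stop accepting data from forwarders
  splunk disable listen 9997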

r. Ismo


Sherlock_Data
Observer

Hi,

 

I would like to merge two different index clusters. One has always been here and the other was added later from an existing environment.

Except for internal indexes, each cluster has its own indexes.

The "expiration scenario" is the last option we want because we would like to remove cluster A servers as they are too old.


shivanshu1593
Builder

The one issue that I see with this approach is the transfer of data buckets from cluster A to cluster B. Every indexer creates buckets with its own unique GUID. Transferring those buckets to a new cluster whose cluster master has no idea which peers those buckets belong to would cause a massive headache for you.

If you can re-ingest the data, that would solve your problem easily. Otherwise, I highly recommend involving Splunk support in this operation and getting their guidance.

Thank you,
Shiv

Sherlock_Data
Observer

It's impossible to re-ingest all the data, as it has been collected over many years.

One of my first tasks is to check the bucket IDs and make sure there are no duplicates, but I'm pretty sure there are none.
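
For that check, a rough sketch of a search run from a search head that can see both clusters. Within one cluster several holders per bucketId are normal because of replication; a real collision would be the same bucketId held by peers of both clusters, which you can spot by comparing the holder list against your per-cluster host names:

  | dbinspect index=*
  | stats values(splunk_server) AS holders dc(splunk_server) AS holder_count BY index bucketId
  | sort - holder_count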
