
Rebalancing Indexes On New Larger Index Nodes

New Member

We are looking to upgrade our Splunk Indexers to a set of more performant AWS instances with more disk space available to them.

I was reading about data rebalancing, and the documentation assumes that the amount of available disk space is the same across all instances.

I was wondering if the following is possible:

  1. Add the new larger index nodes to the index cluster
  2. Start a data rebalance so the data is distributed across all five nodes
  3. Remove the original two smaller indexers

I understand that an alternative might be to just rsync the data to the new nodes and pivot, but I would like to avoid the downtime if I can.
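For reference, the data rebalance in step 2 is started from the cluster master. A minimal sketch (assuming a Splunk version that supports data rebalancing, i.e. 6.4 or later, with clustering already configured):

```shell
# Run on the cluster master after the new peers have joined the cluster.
# Starts redistributing existing bucket copies across all peers:
splunk rebalance cluster-data -action start

# Check on progress at any time:
splunk rebalance cluster-data -action status
```

The rebalance can also be limited to a single index with `-index <name>` if you want to test on a small index first.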

1 Solution

Legend

tl;dr - what you have suggested will work!

You have a number of options for replacing your indexers. While rsync is one way, I agree that the downtime is not good - and there are other potential issues; I don't recommend that technique.
I could write a book on all the ways you could do this, but here is an overview of a couple of ideas that leverage multi-site clustering and manual detention...

Regardless of the technique, I suggest that you do this first:
1 - stand up the new improved indexers
2 - set all data to be forwarded to the new indexers only
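Step 2 is just an outputs.conf change on the forwarders. A minimal sketch, with hypothetical hostnames, assuming the new indexers listen on the default receiving port 9997:

```ini
# outputs.conf on each forwarder (or pushed via the deployment server)
[tcpout]
defaultGroup = new_indexers

[tcpout:new_indexers]
# Only the new peers are listed, so no new data lands on the old ones
server = idx-new-1.example.com:9997,idx-new-2.example.com:9997,idx-new-3.example.com:9997
```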

Some questions that may help you pick a technique: First, what is the data retention for your indexed data? 30 days? A year? How much old data needs to be copied? Second, what is the usual time range for user searches?

Option A - just let it expire / multi-site
For this option, you need to set up multi-site clustering. Put the new indexers on a different site than the old indexers, and set the site replication and search factors so that nothing is replicated back to the old indexers.
Now just wait it out. As time goes on, the data (buckets) will expire on the old indexers. Don't decommission the old indexers until all the old data has expired.
Since most searches run over recent data, you will probably find that the workload very quickly shifts to the new environment.
While this technique takes a while, it causes zero downtime and has no performance issues.
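As a sketch, the relevant server.conf settings on the cluster master might look like this, assuming the old peers are assigned site1 and the new peers site2 (the exact factor values are an example and depend on your cluster):

```ini
# server.conf on the cluster master
[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
# With all new data forwarded to site2 peers, origin-based factors
# keep every new copy on site2 and nothing replicates back to site1:
site_replication_factor = origin:2,total:2
site_search_factor = origin:1,total:1
```

Each peer declares its own site in the `[general]` stanza of its server.conf.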

Option B - gradual decommission with detention
For this to work, the old indexers must not have any non-clustered buckets. (And if you are currently doing multi-site clustering, the old indexers must not have any single-site buckets.) There is no need to change existing replication/search factors.
New data is flowing to the new indexers, but in this scenario, it could be replicated back to the old indexers. To avoid that, place each of the old indexers in manual detention. This keeps the new indexers from replicating back to the old indexers.
Next, decommission each of the old indexers, one at a time. As each indexer is decommissioned, Splunk will move the buckets onto surviving indexers as needed. This could potentially have some performance impact and take a long time per indexer, depending on the amount of data to be copied.
This option requires zero downtime.
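A sketch of the two commands involved, run on each old peer in turn (these are real CLI commands, but the timing and sequencing will depend on your environment):

```shell
# On an OLD peer: enter manual detention, so it accepts no new
# data and no replicated bucket copies from the new peers:
splunk edit cluster-config -manual_detention on

# Later, to decommission the peer: take it offline and wait for the
# cluster to make up the missing bucket copies before it shuts down:
splunk offline --enforce-counts
```

Decommission one peer at a time, and wait until the master reports the cluster complete before moving on to the next.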

Option C - just do it
In this option, you just decommission each of the old indexers as soon as you have the new indexers running. (But just decommission one at a time.) Without using manual detention, there is no telling where the bucket copies will go, but eventually they will all end up on the new indexers. While this has zero downtime, it will have the most performance impact as buckets may be moved repeatedly. As in option B, you need to make sure that there are no non-replicated (or single-site) buckets, because they will not be copied during decommissioning.

The method that you describe is an improved variation of option C - the data rebalance will give you a head start on the bucket copying process.
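Putting the asker's plan together, the end-to-end sequence might look like this (a hedged sketch with placeholder hostnames and secret; run each command where noted):

```shell
# 1. On each NEW indexer: join it to the existing cluster, then restart.
splunk edit cluster-config -mode slave -master_uri https://master.example.com:8089 \
    -replication_port 9887 -secret yourclustersecret
splunk restart

# 2. On the cluster master: spread existing buckets across all five peers:
splunk rebalance cluster-data -action start

# 3. Once rebalancing finishes, on each OLD peer, one at a time:
splunk offline --enforce-counts
```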

I hope that this reply gives you some ideas about what might be easiest, depending on the amount of data that you have and how quickly you want to complete the migration.


New Member

Iguinn,

Thanks for the answer. Luckily our data set is small and likely not all required. I will attempt option B and report back with my findings.

