Getting Data In

Is there a way to migrate indexed data from a legacy standalone indexer to a new indexer cluster?

rewritex
Contributor

I've read through quite a few pages and there are mixed partial solutions.

Is there a way to migrate indexed data from a standalone deployment into a new indexer cluster deployment?

The approach I've researched so far:
1) Create indexes in the indexer cluster with the same names as on the standalone indexer.
2) Copy the index contents from the legacy indexes into the new indexer cluster's index directories.

I've read that enabling the standalone indexer as a cluster peer only replicates new/ongoing data and does not replicate the legacy data. The other option I've seen is to just contact Professional Services.

esix_splunk
Splunk Employee

As jkat54 mentions, your process is basically correct. The standard process for migrating single-instance buckets to a single-site or multisite cluster typically looks like this:

1) Cut over all forwarders to the new indexers and validate that no data is coming into the old instance.
2) Restart Splunk to roll all hot buckets to warm.
3) Copy/migrate all existing single-instance buckets to the destination index on the new clustered instances.
4) Restart the indexer(s) and look for potential bucket collisions. (There shouldn't be any, because buckets have a different naming structure in standalone vs. clustered deployments.)
5) Confirm the data is searchable. (Search the raw data; I also recommend using dbinspect to validate the buckets and the time ranges in the migrated buckets.)

Rinse and repeat as needed.
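
For step 3 and the collision check in step 4, something along these lines could be used. This is only a rough sketch, not an official tool: it assumes warm buckets are directories named db_<newest>_<oldest>_<id> (optionally followed by _<guid> on cluster peers), treats only an identical directory name as a collision, and the function names plan_bucket_copy and copy_buckets are made up for illustration. Run it with Splunk stopped on both sides and after the hot buckets have rolled.

```python
import re
import shutil
from pathlib import Path

# Warm/cold bucket directories: db_<newest>_<oldest>_<local_id>, optionally
# followed by _<guid> on clustered peers. Hot buckets (hot_v1_*) do not match.
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)(?:_([0-9A-Fa-f-]+))?$")


def plan_bucket_copy(src_db: Path, dst_db: Path):
    """Return (to_copy, collisions) as lists of bucket directory names."""
    to_copy, collisions = [], []
    for entry in sorted(src_db.iterdir()):
        if not entry.is_dir() or not BUCKET_RE.match(entry.name):
            continue  # skip hot buckets, stray files, etc.
        if (dst_db / entry.name).exists():
            collisions.append(entry.name)  # same name already at destination
        else:
            to_copy.append(entry.name)
    return to_copy, collisions


def copy_buckets(src_db: Path, dst_db: Path, dry_run: bool = True):
    """Copy warm buckets from src_db to dst_db; refuse if any name collides."""
    to_copy, collisions = plan_bucket_copy(src_db, dst_db)
    if collisions:
        raise RuntimeError(f"bucket name collisions: {collisions}")
    for name in to_copy:
        if not dry_run:
            shutil.copytree(src_db / name, dst_db / name)
    return to_copy
```

You would point src_db and dst_db at the db directory of the same index on the old and new side (the exact paths depend on your SPLUNK_DB and indexes.conf settings), do a dry run first, and only then call it with dry_run=False.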

Again, it's worth noting that this process migrates the buckets as standalone buckets, so they won't be candidates for replication. You could rename all these buckets to match the server's GUID (look at the existing buckets for the format), and they will then be eligible for replication, assuming there are no collisions. However, as the docs state, this could take a large amount of time, as the buckets have to go through the replication process. And this isn't recommended.
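
To illustrate what the renaming above amounts to, here is a small sketch of the name change only. It assumes the clustered form is the standalone name with _<guid> appended, which is what you should see on the cluster's existing buckets; the helper name clustered_name and the GUID in the usage line are placeholders, and this is not a recommendation to do the rename.

```python
import re

# Standalone warm bucket name: db_<newest>_<oldest>_<local_id>
STANDALONE_RE = re.compile(r"^db_\d+_\d+_\d+$")


def clustered_name(bucket_dir: str, guid: str) -> str:
    """Return the bucket name with the peer's GUID appended (assumed format)."""
    if not STANDALONE_RE.match(bucket_dir):
        raise ValueError(f"not a standalone warm bucket name: {bucket_dir}")
    return f"{bucket_dir}_{guid}"


# Placeholder GUID, purely for illustration.
print(clustered_name("db_1500000000_1400000000_12",
                     "ABCDEF01-2345-6789-ABCD-EF0123456789"))
```

Compare the output against a bucket the cluster created itself before renaming anything on disk.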

Raghav2384
Motivator

We do bucket shuffling within our indexer clusters on a weekly basis 🙂 (A few indexers have less disk space than the others... don't judge me, please.)

What @jkat and @esix [Splunk] provided is all you need. Even though no buckets exist on the new cluster yet, I would still recommend bumping the bucket IDs on the standalone by 1000 (for example, db_1234567_1234567_100 to db_1234567_1234567_1000) before moving them to the new cluster (1:3 ratio).
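
A rough sketch of the ID bump, in case it helps. It reads "bump by 1000" as adding a fixed offset to the trailing bucket ID, which is an assumption: the example above turns 100 into 1000, whereas this turns 100 into 1100. Either way the point is to move the IDs out of the range the new cluster will use. The function name bump_bucket_id is made up for illustration, and it only computes the new name; it does not rename anything on disk.

```python
import re

# Standalone warm bucket name: db_<newest>_<oldest>_<local_id>
BUCKET_RE = re.compile(r"^(db_\d+_\d+)_(\d+)$")


def bump_bucket_id(bucket_dir: str, offset: int = 1000) -> str:
    """Return the bucket name with its trailing local ID increased by offset."""
    match = BUCKET_RE.match(bucket_dir)
    if match is None:
        raise ValueError(f"not a standalone warm bucket name: {bucket_dir}")
    prefix, local_id = match.group(1), int(match.group(2))
    return f"{prefix}_{local_id + offset}"


print(bump_bucket_id("db_1234567_1234567_100"))
```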

Note: To move hot_v buckets, you have to force them to roll to warm first and then move them.

As long as you follow the suggestions as is, you should be good. Like I said, we were all scared the first time. Now shuffling the buckets is part of our weekly maintenance 😉

Thanks
Raghav


gcusello
SplunkTrust

You cannot! The only way would be to reindex everything.
I created a new index in the cluster with a different name (e.g. c_index1) and an eventtype that covers both the old and the new clustered indexes (e.g. index=index1 OR index=c_index1).
Bye.
Giuseppe


esix_splunk
Splunk Employee

This is incorrect. You can migrate the previously indexed data over; however, it won't be replicated, because the buckets have a different naming convention.

jkat54
SplunkTrust


jkat54
SplunkTrust

So the link I provided says "call Professional Services", but you can do it. The issue is that the data won't be replicated, but it will at least be available to search if you do it that way. I recommend dropping the buckets in "thawed". Also, you'll want to roll all buckets from hot to warm before beginning the migration.
