Migrating hot/warm and cold buckets to separate drives

ejharts2015
Communicator

This is more of an FYI on how we did it than an actual question. It took us a while to find all the info we needed to split the hot/warm and cold buckets onto their own drives, both on a standalone indexer AND on an indexer in a cluster, so here's how we did it. Hopefully posting it here will spare others some of the pain.

Our environment consists of a 3-node indexer cluster controlled by one master. Our Search Factor and Replication Factor were both 2. All our data was originally stored in one volume: /mnt/idx-storage. We wanted to move to /mnt/idx-storage-warm and /mnt/idx-storage-cold to split these buckets up, with ~1 month of data in the hot/warm buckets and the rest in cold. Our cluster lives on AWS.
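If you're sizing a similar split, it helps to look at actual per-index usage on the existing volume first. Nothing Splunk-specific here, just standard tools (paths assume our layout):

  # Per-index disk usage on the existing volume, plus overall headroom
  sudo du -sh /mnt/idx-storage/opt/splunk/var/lib/splunk/*/
  df -h /mnt/idx-storage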

Here's how we did it.

1 Solution

ejharts2015
Communicator

We did a bunch of math on how much storage space we'd need in order to determine drive sizes. For our setup/data ingestion we went with a 2 TB SSD (hot/warm) and a 16 TB drive (cold). For the cold drives we're using the new AWS cold storage as a cheaper alternative for long-term storage.
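For context, the shape of that math is roughly this (illustrative numbers only, not our actual ingest; ~50% is the common rule of thumb for compressed rawdata plus index files on disk):

  100 GB/day raw x 0.5 on-disk x 30 days x 2 copies (RF=2)  = ~3 TB hot/warm cluster-wide, ~1 TB per indexer
  100 GB/day raw x 0.5 on-disk x 335 days x 2 copies (RF=2) = ~33.5 TB cold cluster-wide, ~11 TB per indexer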

We put our cluster in maintenance mode, took down one indexer at a time, did all the following steps, and brought it back up. We left the master in maintenance mode until all three indexers had been updated. We also pushed out a new indexes.conf with an added "homePath.maxDataSizeMB = [number]" setting to keep our hot/warm buckets small enough to fit on the new 2 TB drive.
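For anyone who hasn't done the maintenance-mode dance before, it goes roughly like this (a sketch assuming a default /opt/splunk install; master commands run on the cluster master):

  # On the cluster master: pause bucket-fixup activity while peers bounce
  /opt/splunk/bin/splunk enable maintenance-mode
  # On the indexer being migrated: take the peer down gracefully
  /opt/splunk/bin/splunk offline
  # ...do the migration steps below on that indexer, then...
  /opt/splunk/bin/splunk start
  # On the cluster master, after all three indexers are done:
  /opt/splunk/bin/splunk disable maintenance-mode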

  1. Create 2 disks in the availability zone of the box (note: our COLD drive was cloned from an 8TB snapshot of our existing drive): SSD - 2000 GiB, COLD - 16387 GiB (max size)
  2. Create two new mount points: sudo mkdir /mnt/idx-storage-warm sudo mkdir /mnt/idx-storage-cold
  3. Determine which drives are where: lsblk (note: ours were located at /dev/xvdg and /dev/xvdh; make sure you use the right /dev locations on your install)
  4. ONLY if your drives are brand new, add a file system. If they are clones of your old data, then DO NOT do this: sudo mkfs -t ext4 /dev/xvdg sudo mkfs -t ext4 /dev/xvdh
  5. Mount the drives: sudo mount /dev/xvdg /mnt/idx-storage-warm/ sudo mount /dev/xvdh /mnt/idx-storage-cold/
  6. Modify the /etc/fstab to automount the drives when the box restarts: /dev/xvdg /mnt/idx-storage-warm ext4 defaults,nobootwait 0 0 /dev/xvdh /mnt/idx-storage-cold ext4 defaults,nobootwait 0 0
  7. Migrate the data. The indexes are stored in five different folders under each of your indexes, based on the type of data, which makes for quite the mess. They're located on the filesystem at: /opt/splunk/var/lib/splunk/[name-of-index]/[colddb, thaweddb, db, summary, datamodel_summary]/*
  7.5. We found out later that you should roll your hot buckets to warm before copying them, or you'll run into issues. We just deleted all the hot buckets and lost that data, but there's actually a CLI command that rolls them all for you so you don't have to do this (see the example command after this list). More info here from Splunk docs
  8. What we ended up doing was snapshotting our entire existing 8TB drive and copying it to the cold storage, then rsyncing ONLY the warm buckets (not hot buckets) to the warm drive location: rsync -avpR --progress --exclude '/opt/splunk/var/lib/splunk/*/colddb/' --exclude '/opt/splunk/var/lib/splunk/*/thaweddb/' --exclude '/opt/splunk/var/lib/splunk/*/db/hot*' /opt/splunk/ /mnt/idx-storage-warm/
  9. We needed to resize the cold drive from the 8TB copy to the 16TB new size, which you can do live while mounted: sudo resize2fs /dev/xvdh
  10. Modify the splunk-launch.conf file to indicate where splunk should launch from: sudo nano /mnt/idx-storage-warm/opt/splunk/etc/splunk-launch.conf We changed /mnt/idx-storage to /mnt/idx-storage-warm/ for both SPLUNK_HOME and SPLUNK_DB
  11. Update the indexes.conf config locally (this would be the one normally pushed by the master; we'll fix this later on in the install): sudo nano /mnt/idx-storage-warm/opt/splunk/etc/slave-apps/_cluster/local/indexes.conf
  12. Here's a sample of how our config looks. Note how the cold path and thawed path both point to the cold storage while the home path points to the warm storage. frozenTimePeriodInSecs is our retention period before data is deleted, and homePath.maxDataSizeMB says how many MB of hot/warm data to keep before rolling it to cold storage (sadly you can only cap this by space, not time).

      [our_data_index]
      repFactor = auto
      coldPath = /mnt/idx-storage-cold/opt/splunk/var/lib/splunk/our_data_index/colddb
      homePath = /mnt/idx-storage-warm/opt/splunk/var/lib/splunk/our_data_index/db
      thawedPath = /mnt/idx-storage-cold/opt/splunk/var/lib/splunk/our_data_index/thaweddb
      frozenTimePeriodInSecs = 315569520
      homePath.maxDataSizeMB = 5000
  13. Resymlink the new warm storage to the normal splunk install location: sudo rm /opt/splunk sudo ln -s /mnt/idx-storage-warm/opt/splunk/ /opt/splunk
  14. Unmount the old drive: sudo umount /mnt/idx-storage
  15. Update the init.d scripts: sudo /mnt/idx-storage-warm/opt/splunk/bin/splunk disable boot-start sudo /mnt/idx-storage-warm/opt/splunk/bin/splunk enable boot-start -user splunk
  16. Fix any permissions that might have gotten screwed up: sudo chown -R splunk:splunk /mnt/idx-storage-cold/opt/splunk sudo chown -R splunk:splunk /mnt/idx-storage-warm/opt/splunk
  17. Turn the indexer back on and wait patiently (and I do mean patiently) for it to come back up and rejoin the cluster. Then move on to the next indexer.
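Re step 7.5: the roll-them-all-for-you option is the roll-hot-buckets REST endpoint, which you can hit per index from the CLI. Something like this (index name and credentials are placeholders):

  # Roll all hot buckets in one index to warm before copying
  /opt/splunk/bin/splunk _internal call /data/indexes/our_data_index/roll-hot-buckets -method POST -auth admin:yourpassword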

Some other cleanup items:
1. Removing unneeded data from cold storage (since we cloned the drive; see the sketch below)
2. Removing/Detaching the old 8TB drive.
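For cleanup item 1: since the cold drive is a byte-for-byte clone of the original, everything on it other than colddb and thaweddb is dead weight. Something along these lines works, but triple-check the paths before running anything with rm -rf:

  # On each indexer: drop hot/warm and summary data from the cloned cold drive;
  # indexes.conf only references colddb and thaweddb on this volume
  cd /mnt/idx-storage-cold/opt/splunk/var/lib/splunk
  for idx in */; do
    sudo rm -rf "${idx}db" "${idx}summary" "${idx}datamodel_summary"
  done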

Good luck! Hope this helps!

aaraneta_splunk
Splunk Employee

Hi @ejharts2015 - If you could put your "How you did it" as an Answer below and "Accept" it, that would be great. That way your post doesn't look like it has been unanswered forever. Thanks!

ejharts2015
Communicator

Yup, sorry it was a long one, so just finished typing it up 🙂

aaraneta_splunk
Splunk Employee

No prob, thanks for giving such a great and detailed answer! 😄

skoelpin
SplunkTrust

I did this exact thing a few weeks ago. It looks like you forgot to add the "How you did it" part. I'm very interested in seeing your approach.

Here was mine
https://answers.splunk.com/answers/474795/whats-the-best-way-to-migrate-db-from-one-drive-to.html

ejharts2015
Communicator

Posted it now. We had a lot of fun with the bucket collisions as well 🙂
