Splunk Enterprise

Why are we encountering an issue after a data migration with DISABLED Buckets?

edoardo_vicendo
Contributor

Hello,

We are encountering an issue after a data migration.

The data migration was needed to increase disk performance.

Basically we moved all the Splunk data from disk1 to disk2 on a single Splunk Indexer instance belonging to a Multi-Site Splunk Indexer Cluster.

The procedure was:

  1. With Splunk running, rsync the data from disk1 to disk2
  2. Once the rsync has finished, stop Splunk
  3. Put the Cluster in maintenance mode
  4. Run the rsync again to copy the remaining delta from disk1 to disk2
  5. Remove disk1 and point Splunk to disk2
  6. Restart Splunk

 

After we restarted Splunk, some buckets were marked as DISABLED.

This happened because, when we stopped Splunk at step 2, the hot buckets rolled to warm (on disk1).

As a result, the rsync at step 4 copied those freshly rolled warm buckets from disk1 to disk2, where hot buckets with the same ID were already present from the first rsync. That caused the conflict, and the buckets were marked as DISABLED.
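
A conflict like this can be looked for on the target disk before Splunk is restarted. What follows is only a minimal sketch, assuming the clustered directory naming that appears in the log lines further down (hot_v1_<id>, db_<newest>_<oldest>_<id>_<guid>, rb_<newest>_<oldest>_<id>_<guid> and <id>_<guid>). The function names bucket_key and find_bucket_conflicts are hypothetical and not Splunk tooling, and the local GUID has to be supplied by you (it is normally stored in $SPLUNK_HOME/etc/instance.cfg).

```shell
#!/bin/sh
# Hypothetical helpers (not part of Splunk): reduce a bucket directory name
# to "<localid>~<guid>" so that two directories holding the same bucket ID
# can be spotted before splunkd detects the conflict at startup.

# bucket_key NAME LOCAL_GUID
# Prints nothing for names that do not look like bucket directories.
bucket_key() {
    name=$1
    local_guid=$2
    case $name in
        hot_v1_*)
            # Local hot bucket: the GUID is not part of the name.
            printf '%s~%s\n' "${name#hot_v1_}" "$local_guid"
            ;;
        db_*|rb_*)
            # Strip prefix, newest time and oldest time: "<id>_<guid>" remains.
            rest=${name#*_}
            rest=${rest#*_}
            rest=${rest#*_}
            printf '%s~%s\n' "${rest%%_*}" "${rest#*_}"
            ;;
        [0-9]*_*)
            # Replicated hot bucket: "<id>_<guid>".
            printf '%s~%s\n' "${name%%_*}" "${name#*_}"
            ;;
    esac
}

# find_bucket_conflicts LOCAL_GUID
# Reads directory names on stdin and prints every key that occurs more than once.
find_bucket_conflicts() {
    local_guid=$1
    while IFS= read -r name; do
        bucket_key "$name" "$local_guid"
    done | sort | uniq -d
}

# Example (path and GUID taken from the logs in this post):
# ls /products/data/xxxxxxxxx/splunk/db/_internaldb/db | find_bucket_conflicts 3C08D28D-299A-448E-BD23-C0E9B071E694
```

Any key printed by find_bucket_conflicts points at two directories that claim the same bucket, which is the situation that later produced the DISABLED- prefix here.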

 

So the DISABLED buckets may now contain more data (though not all of it) than the non-disabled ones. Furthermore, the non-disabled ones have already been replicated within the cluster.

Do you think there is a way to recover those DISABLED buckets so that they will be searchable again?

I see here:

https://community.splunk.com/t5/Deployment-Architecture/What-is-the-naming-convention-behind-the-db-...

https://docs.splunk.com/Documentation/Splunk/latest/Indexer/HowSplunkstoresindexes#Bucket_naming_con...

it seems that the solution could be, if I understood correctly, to rename the bucket directory (with the Splunk instance not running), for example from DISABLED-db_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694 to db_1631215114_1631070671_herechangethebucketID_3C08D28D-299A-448E-BD23-C0E9B071E694

If so:

  • How many digits are allowed for the bucketID?
  • Does anyone have experience doing this?
  • Once done, will the new buckets be replicated within the cluster?

 

Here is what I find in the internal logs when checking one of the affected buckets:

Query:

 

index=_internal *1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694 source!="/opt/splunk/var/log/splunk/splunkd_ui_access.log"  source!="/opt/splunk/var/log/splunk/remote_searches.log" | sort -_time

 

Result:

 

09-09-2021 14:18:41.758 +0200 INFO  HotBucketRoller - finished moving hot to warm bid=_internal~448~3C08D28D-299A-448E-BD23-C0E9B071E694 idx=_internal from=hot_v1_448 to=db_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694 size=10475446272 caller=size_exceeded _maxHotBucketSize=10737418240 (10240MB,10GB), bucketSize=10878386176 (10374MB,10GB)



09-09-2021 14:18:41.767 +0200 INFO  S2SFileReceiver - event=rename bid=_internal~448~3C08D28D-299A-448E-BD23-C0E9B071E694 from=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/448_3C08D28D-299A-448E-BD23-C0E9B071E694 to=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694



09-09-2021 14:18:41.795 +0200 INFO  S2SFileReceiver - event=rename bid=_internal~448~3C08D28D-299A-448E-BD23-C0E9B071E694 from=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/448_3C08D28D-299A-448E-BD23-C0E9B071E694 to=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694



09-09-2021 14:18:41.817 +0200 INFO  S2SFileReceiver - event=rename bid=_internal~448~3C08D28D-299A-448E-BD23-C0E9B071E694 from=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/448_3C08D28D-299A-448E-BD23-C0E9B071E694 to=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694



09-09-2021 15:53:19.476 +0200 INFO  DatabaseDirectoryManager - Dealing with the conflict bucket="/products/data/xxxxxxxxx/splunk/db/_internaldb/db/db_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694"...



09-09-2021 15:53:19.477 +0200 ERROR DatabaseDirectoryManager - Detecting bucket ID conflicts: idx=_internal, bid=_internal~448~3C08D28D-299A-448E-BD23-C0E9B071E694, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/hot_v1_448, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/db_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-db_1631215114_1631070671_448_3C08D28D-299A-448E-BD23-C0E9B071E694. Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~595~E17D5544-7169-4D32-B7C0-3FD972956D4B, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/595_E17D5544-7169-4D32-B7C0-3FD972956D4B, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1628215818_1627992904_595_E17D5544-7169-4D32-B7C0-3FD972956D4B. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1628215818_1627992904_595_E17D5544-7169-4D32-B7C0-3FD972956D4B. Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~591~12531CC6-0C79-473A-859E-9ADF617941A2, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/591_12531CC6-0C79-473A-859E-9ADF617941A2, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1628647804_1628215848_591_12531CC6-0C79-473A-859E-9ADF617941A2. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1628647804_1628215848_591_12531CC6-0C79-473A-859E-9ADF617941A2. 
Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~606~1D0FBF00-A5FF-4767-A044-F3C6F01BAD84, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/606_1D0FBF00-A5FF-4767-A044-F3C6F01BAD84, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1630204023_1629772040_606_1D0FBF00-A5FF-4767-A044-F3C6F01BAD84. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1630204023_1629772040_606_1D0FBF00-A5FF-4767-A044-F3C6F01BAD84. Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~603~1D0FBF00-A5FF-4767-A044-F3C6F01BAD84, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/603_1D0FBF00-A5FF-4767-A044-F3C6F01BAD84, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631172918_1631063432_603_1D0FBF00-A5FF-4767-A044-F3C6F01BAD84. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1631172918_1631063432_603_1D0FBF00-A5FF-4767-A044-F3C6F01BAD84. Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~436~2E5A3717-4C0C-487C-87D3-A7127B3DB42D, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/436_2E5A3717-4C0C-487C-87D3-A7127B3DB42D, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631196626_1631073242_436_2E5A3717-4C0C-487C-87D3-A7127B3DB42D. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1631196626_1631073242_436_2E5A3717-4C0C-487C-87D3-A7127B3DB42D. 
Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~589~12531CC6-0C79-473A-859E-9ADF617941A2, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/589_12531CC6-0C79-473A-859E-9ADF617941A2, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631199124_1630935298_589_12531CC6-0C79-473A-859E-9ADF617941A2. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1631199124_1630935298_589_12531CC6-0C79-473A-859E-9ADF617941A2. Please check this disabled bucket for manual removal.\nDetecting bucket ID conflicts: idx=_internal, bid=_internal~594~E17D5544-7169-4D32-B7C0-3FD972956D4B, path1=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/594_E17D5544-7169-4D32-B7C0-3FD972956D4B, path2=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/rb_1631215283_1630935291_594_E17D5544-7169-4D32-B7C0-3FD972956D4B. Temporally resolved by disabling the bucket: path=/products/data/xxxxxxxxx/splunk/db/_internaldb/db/DISABLED-rb_1631215283_1630935291_594_E17D5544-7169-4D32-B7C0-3FD972956D4B. Please check this disabled bucket for manual removal.\n

 

Thanks a lot,

Edoardo

1 Solution

edoardo_vicendo
Contributor

@isoutamo thanks for your feedback. We couldn't use any storage-level tool because we changed the number of LUNs seen by the OS (to increase performance). We tested moving the data by letting the Splunk cluster replicate the buckets, but moving 8 TB per indexer that way took too long, so we went with rsync.

I have written 2 guides:

  1. How to rsync data on an Indexer from one disk to another disk
  2. How to recover DISABLED buckets

Here are the details:

How to rsync data on an Indexer from one disk to another disk

  1. With Splunk running, rsync the data from disk1 to disk2
  2. Once the rsync has finished, put the Cluster in maintenance mode
  3. Stop Splunk on the Indexer
  4. Run the rsync again to copy the remaining delta from disk1 to disk2 (this time with the --delete option)
  5. Verify that the data on disk2 matches the data on disk1
  6. umount disk1 and point Splunk to disk2
  7. Restart Splunk on the Indexer
  8. Take the Cluster out of maintenance mode

 

Rsync procedure to move Splunk data from OLD to NEW disk

Example:
Splunk data are stored here SPLUNK_DB=/products/data/splunk
disk1 is mounted as /products/data/splunk
disk2 is mounted as /products/data/splunk2
script_01.sh performs the first, large copy (and can be executed with Splunk running)
script_02.sh performs the final delta copy (and has to be executed with Splunk NOT running and with the Cluster in maintenance mode)

1- On Splunk Indexer create the 2 scripts

in /opt/splunk/

script_01.sh
#!/bin/bash
time rsync -aP /products/data/splunk/ /products/data/splunk2/

script_02.sh
#!/bin/bash
time rsync -aP --delete /products/data/splunk/ /products/data/splunk2/

2- Procedure to run first script

#run script 01 without stopping splunk for the first sync
nohup /opt/splunk/script_01.sh 2>&1 &

3-Procedure to run second script

#Check whether the first script has finished with
ps -ef | grep rsync
#then check nohup.out for errors

#check file system (space usage on disk2 should not increase anymore)
df -h
df -hm

#put the cluster in maintenance mode on Master Node
splunk enable maintenance-mode
splunk show maintenance-mode
splunk show cluster-status --verbose | head -20

#stop splunk on Indexer
./splunk stop

#run second script to perform the last copy in delta
nohup /opt/splunk/script_02.sh 2>&1 &

#Check whether the second script has finished with
ps -ef | grep rsync
#then check nohup.out for errors

#check file system
df -h
df -hm

# Procedure to check that disk1 is in sync with disk2

## list files and save the output
find /products/data/splunk/ -print > /opt/splunk/file_disk_01.txt
find /products/data/splunk2/ -print > /opt/splunk/file_disk_02.txt

## modify absolute path to then perform the comparison
vi file_disk_02.txt
:%s/splunk2/splunk/g
:wq

## sort the files so that diff compares both lists in the same order
sort file_disk_01.txt > file_disk_01_sorted.txt
sort file_disk_02.txt > file_disk_02_sorted.txt

## perform the diff (should not give any result if files are equal between disk1 and disk2)
diff file_disk_01_sorted.txt file_disk_02_sorted.txt

## also perform a md5 check (should give same hash)
md5sum file_disk_01_sorted.txt
md5sum file_disk_02_sorted.txt

splunk@myindexer:~ > md5sum file_disk_01_sorted.txt
31xx87c049fxx24b353xx8c45xx1b198  file_disk_01_sorted.txt
splunk@myindexer:~ > md5sum file_disk_02_sorted.txt
31xx87c049fxx24b353xx8c45xx1b198  file_disk_02_sorted.txt
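
The same check can be scripted without editing the file list in vi. This is only a sketch under the assumptions of this guide (two directory trees that should hold the same paths); compare_trees is a hypothetical helper name, and like the steps above it compares file names only, not file contents.

```shell
#!/bin/sh
# Hypothetical helper: compare the relative file listings of two trees.
# Returns 0 when both trees contain exactly the same paths; otherwise prints
# the diff and returns non-zero. It compares names only, not file contents.
compare_trees() {
    src=$1
    dst=$2
    tmpdir=$(mktemp -d) || return 2
    # Listing from inside each tree yields relative paths, so no
    # search-and-replace of the mount point is needed afterwards.
    (cd "$src" && find . -print | sort) > "$tmpdir/src.txt"
    (cd "$dst" && find . -print | sort) > "$tmpdir/dst.txt"
    diff "$tmpdir/src.txt" "$tmpdir/dst.txt"
    rc=$?
    rm -rf "$tmpdir"
    return $rc
}

# Example with the paths used in this guide:
# compare_trees /products/data/splunk /products/data/splunk2 && echo "in sync"
```

Listing relative paths also avoids the risk that a global replace of "splunk2" touches a file name that happens to contain that string.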

#Before switching the disk and restarting Splunk, also check whether any DISABLED buckets already exist (to avoid confusion if new ones appear after the Splunk restart)
ls -la /products/data/splunk/db/*/db/DISABLED* | grep products
ls -la /products/data/splunk2/db/*/db/DISABLED* | grep products

#switch disk (umount old disk, umount new disk, change mount point in /etc/fstab, mount new disk)
#restart splunk on Indexer
./splunk start

#check if DISABLED buckets are present (there should be none, apart from any that already existed)
ls -la /products/data/splunk/db/*/db/DISABLED* | grep products

#Check that the Monitoring Console on the MN is OK
#Run some queries over the last 7 days on the SH
index=_internal splunk_server=myindexer
index=_internal splunk_server=myindexer | stats count
index!=_* index=* splunk_server=myindexer
index!=_* index=* splunk_server=myindexer | stats count
index=_internal
index!=_* index=*
index=_internal | stats count
index!=_* index=* | stats count

#disable maintenance mode on Master Node
splunk disable maintenance-mode
splunk show maintenance-mode

#Run some queries again over the last 7 days on the Search Head once Replication Factor and Search Factor are met

 

 

How to recover DISABLED buckets

As indicated here it seems there are 9 digits available for the bucketID

https://community.splunk.com/t5/Deployment-Architecture/Max-Value-for-bucket-ID/m-p/76731

 

Example:
Splunk data are stored here SPLUNK_DB=/products/data/splunk

#Backup the DISABLED bucket
##list the folders
ls -la /products/data/splunk/db/*/db/DISABLED* | grep product

##create filelist.txt with the list of folders (remove trailing ":" from previous command if any)
tar -cvf /products/data/splunk/bckDisabled/bckbucket.tar -T /products/data/splunk/bckDisabled/filelist.txt
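
The file list for tar can also be produced directly with find instead of cleaning up ls output by hand. A minimal sketch, assuming the directory layout of this example; list_disabled_buckets is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every DISABLED-* bucket directory below ROOT,
# one path per line, without descending into the buckets themselves.
list_disabled_buckets() {
    find "$1" -type d -name 'DISABLED-*' -prune -print | sort
}

# Example with the paths used in this guide:
# list_disabled_buckets /products/data/splunk/db > /products/data/splunk/bckDisabled/filelist.txt
```

The output has one directory per line and no trailing ":", so it can be passed to tar -T as is.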

#Put cluster in maintenance mode on Master Node
splunk enable maintenance-mode
splunk show maintenance-mode

#Stop Splunk on your Indexer
./splunk stop

#Rename the DISABLED folder to a non-disabled name, increasing the BucketID (in this example from 105 to 100105)
mv /products/data/splunk/db/audit/db/DISABLED-db_1631192613_1630932002_105_3C08D28D-299A-448E-BD23-C0E9B071E694 /products/data/splunk/db/audit/db/db_1631192613_1630932002_100105_3C08D28D-299A-448E-BD23-C0E9B071E694

Note: Since buckets are normally deleted once their retention time is reached, the new BucketID has to be high enough that it will never be reached by regular buckets in your environment.
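
When several buckets have to be recovered, the new name can be derived rather than typed by hand. This is a sketch only: recovered_bucket_name is a hypothetical helper, the offset of 100000 is simply the value used in this example, and the name layout assumed is <db|rb>_<newest>_<oldest>_<id>_<guid> as shown above.

```shell
#!/bin/sh
# Hypothetical helper: derive the recovered directory name from a
# DISABLED-* name by adding OFFSET to the local bucket ID.
# Usage: recovered_bucket_name DISABLED_NAME OFFSET
recovered_bucket_name() {
    name=${1#DISABLED-}            # <db|rb>_<newest>_<oldest>_<id>_<guid>
    offset=$2
    prefix=${name%%_*}             # db or rb
    rest=${name#*_}
    newest=${rest%%_*}; rest=${rest#*_}
    oldest=${rest%%_*}; rest=${rest#*_}
    id=${rest%%_*}
    guid=${rest#*_}
    printf '%s_%s_%s_%s_%s\n' "$prefix" "$newest" "$oldest" "$((id + offset))" "$guid"
}

# Example (only with Splunk stopped, and after checking the target does not exist):
# cd /products/data/splunk/db/audit/db
# old=DISABLED-db_1631192613_1630932002_105_3C08D28D-299A-448E-BD23-C0E9B071E694
# mv "$old" "$(recovered_bucket_name "$old" 100000)"
```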

#Restart Splunk on your Indexer
./splunk start

#Check data are searchable
index=_audit earliest=1630932002 latest=1631192613 splunk_server=myindexer
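
The earliest and latest values in that search are the two epoch times embedded in the bucket directory name (newest first, then oldest). A small sketch that builds the search string from a bucket name; bucket_search is a hypothetical helper, and the index and server names are the ones from this example.

```shell
#!/bin/sh
# Hypothetical helper: build the verification search for one bucket.
# Usage: bucket_search INDEX BUCKET_DIR SERVER
bucket_search() {
    rest=${2#*_}                   # drop the db_/rb_ prefix
    newest=${rest%%_*}             # first epoch in the name = newest event
    rest=${rest#*_}
    oldest=${rest%%_*}             # second epoch in the name = oldest event
    printf 'index=%s earliest=%s latest=%s splunk_server=%s\n' "$1" "$oldest" "$newest" "$3"
}

# Example:
# bucket_search _audit db_1631192613_1630932002_100105_3C08D28D-299A-448E-BD23-C0E9B071E694 myindexer
```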

#Check bucket is searchable with a REST call

Check with REST
https://yourmasternode:8089/services/cluster/master/buckets/_audit~100105~3C08D28D-299A-448E-BD23-C0E9B071E694
you should see the bucket as searchable on your indexer

#Disable maintenance mode and check that the bucket is replicated (once Search Factor and Replication Factor are met)

Check with REST
https://yourmasternode:8089/services/cluster/master/buckets/_audit~100105~3C08D28D-299A-448E-BD23-C0E9B071E694
you should see the bucket as searchable on more than one indexer (if you are on an Indexer Cluster), depending on your SF and RF

 

 

Note: if you have DISABLED buckets on an Indexer Cluster, they can be both db_* and rb_*.

If you recover all of them you could end up with duplicated data, but duplicated data is better than missing data.

I hope these guides help you plan a data migration or resolve bucket conflicts.

 

Best Regards,

Edoardo

isoutamo
SplunkTrust

Hi

First, I haven't done this fix on a (multisite) cluster.

Basically your analysis was quite correct. So how should you do this next time to avoid the problem?

The easiest way would be to use LVM on Linux: just add the new disks to the VG and then extend the needed filesystem. IMHO never use Splunk without LVM! Also use Splunk's volumes instead of pointing indexes to SPLUNK_DB/xxx. With those two you can avoid a lot of issues.

Another way would be to add a new node with the above disk configuration to the cluster and then remove the old node.

The last one would be:

  1. rsync the live system
  2. update the cluster timeouts that check whether peers are up so that they are long enough (this depends on how many buckets you have and how fast the peers restart)
  3. put the cluster into maintenance mode
  4. stop the peer node
  5. rsync again, deleting removed files, so that the new disk matches the live state
  6. remove the old disk / point the new disk to the correct path
  7. restart Splunk
  8. take the cluster out of maintenance mode

Now to your questions.

I cannot say for sure how many digits there can be (and I believe this changes between versions), but at least 5-6 digits should be OK. I'm not sure whether Splunk would take your changed bucket ID as the new starting point for the bucketID on mlsc. What I have seen is that the same bucketID counter value can exist on several nodes, but since the full bucket ID also contains the node GUID etc., that is not an issue.

As I said, not on production clusters. I have only played with this in a sandbox.

I'm quite sure that Splunk doesn't replicate those buckets, as they are "old buckets". Splunk starts replication when new buckets are created, not for old ones.

r. Ismo
