
Wrong retention caused buckets to freeze in cluster - need to restore frozen data

sat94541
Communicator

We were changing the index cold volume. We changed the size of the warm db and forced buckets to roll to cold.
The buckets rolled to cold, and now we can't search the old data.
The buckets are in the cold db and can still be searched by running the search directly on the search peers,
but each bucket shows the frozen flag set. How do we remove the frozen flag?


scheng_splunk
Splunk Employee

Detailed steps, with corrections to some of the syntax:

WARNING - make sure you have a full backup before going forward.

Step 1 - Place the CM in maintenance mode:
./splunk enable maintenance-mode

Step 2 - On each peer:

  1. Stop splunkd.
  2. Go to each index directory (db and colddb) and:
    2.1 Reset the frozen flag: grep -Rle '1$' --include bucket_info.csv . | xargs sed -i 's/1$/0/g'
    2.2 Back up the bucket manifest file for that index (use a distinct target name per index so earlier backups are not overwritten): mv .bucketManifest /tmp/.bucketManifest
  3. Repeat step 2 until all indexes are done.
  4. Start splunkd.

Step 3 - Restart the CM. IMPORTANT: this allows the latest cluster frozen flag to be updated on the CM.
Step 4 - Confirm all peers are "UP", then disable maintenance mode:
./splunk disable maintenance-mode
Step 5 - Confirm through the GUI that buckets are being replicated.
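
Before running the sed in step 2.1, it can help to list which files would be changed. A minimal read-only sketch, assuming GNU grep and that the frozen flag is the last comma-separated field, as the commands above imply. The function name list_frozen_buckets is made up for illustration and is not a Splunk command:

```shell
#!/bin/sh
# Hypothetical helper: print every bucket_info.csv under a directory
# whose data row ends in ",1" (frozen_in_cluster set). Changes nothing.
list_frozen_buckets() {
    dir="${1:-.}"
    # -R recurses, -l prints matching file names only; --include limits
    # the search to files named bucket_info.csv (a GNU grep option).
    grep -Rl --include=bucket_info.csv ',1$' "$dir"
}

# Example: run from an index directory and count the matches.
# list_frozen_buckets . | wc -l
```

Running the same function again after step 2.1 should print nothing if every flag was reset.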

==============================================
P.S. An additional script that does the job automatically across all buckets:

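#!/bin/ksh
# Run from the directory that holds the index directories (paths look like ./index/db/...).
# Find every bucket_info.csv that has a line ending in 1 (frozen flag set).
for CSV in $(grep -Rle '1$' --include bucket_info.csv .)
do
    # Flip the trailing 1 to 0; write back with cat so the file keeps its owner and mode.
    sed 's/1$/0/g' "$CSV" > /tmp/HOLD.out
    cat /tmp/HOLD.out > "$CSV"
    # Two levels up from the CSV is expected to be the db or colddb directory.
    DIR_NAME=$(dirname "$(dirname "$CSV")")
    if [ -f "${DIR_NAME}/.bucketManifest" ]
    then
        # With paths like ./index/db, field 2 is the index name.
        Index_Name=$(print "$DIR_NAME" | awk -F "/" '{print $2}')
        print " ${DIR_NAME}/.bucketManifest /tmp/${Index_Name}.bucketManifest"
        mv "${DIR_NAME}/.bucketManifest" "/tmp/${Index_Name}.bucketManifest"
    fi
done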
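
One thing to watch in the script above: the backup name uses only the index name, so if both db and colddb of the same index hold a .bucketManifest, the second mv overwrites the first copy in /tmp. A small sketch of a name that keeps both, assuming the same ./index/db style relative paths the script produces. manifest_backup_name is a hypothetical helper, not part of Splunk:

```shell
#!/bin/sh
# Hypothetical helper: turn "./main/colddb" into "main_colddb.bucketManifest"
# so backups from db and colddb of the same index do not collide.
manifest_backup_name() {
    rel="${1#./}"    # drop a leading "./" if present
    # Replace every "/" with "_" and append the manifest suffix.
    printf '%s.bucketManifest\n' "$(printf '%s' "$rel" | tr '/' '_')"
}

# Example use inside the loop, replacing the Index_Name-based target:
# mv "${DIR_NAME}/.bucketManifest" "/tmp/$(manifest_backup_name "$DIR_NAME")"
```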

rbal_splunk
Splunk Employee
  1. Set the CM in maintenance mode, stop all CPs, then stop the CM.

./splunk enable maintenance-mode
./splunk stop

  2. On each CP, cd into the directory: ./splunk/var/lib/splunk//db

  3. Run the command:

grep -Rle '1$' --include bucket_info.csv .

    Verify that these are the only buckets marked as frozen.
    Example:
    "indextime_et","indextime_lt","frozen_in_cluster"
    1484368345,1484385061,1

  4. Run the command to unfreeze the buckets:
    grep -Rle '1$' --include bucket_info.csv . | xargs sed -i 's/1$/0/g'

    This locates the frozen buckets and replaces the 1 (frozen=true) with 0 (frozen=false).

  5. Repeat steps 2-4 for the colddb directory: ./splunk/var/lib/splunk//colddb

  6. Locate the .bucketManifest file by running ls -la within the ./splunk/var/lib/splunk/ directory.

  7. Move the .bucketManifest file out of the Splunk directory into a tmp directory; the file is regenerated with new information during startup.

  8. Start the CM, set it in maintenance mode, start the CPs, then take the CM out of maintenance mode:
    ./splunk disable maintenance-mode
    ./splunk show maintenance-mode
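
To see what the sed edit does without touching real buckets, the change can be tried on a scratch file first. A sketch using the sample row from the example above; the temp directory and the db_example bucket directory name are made up, and sed -i without a suffix assumes GNU sed:

```shell
#!/bin/sh
# Build a throwaway bucket_info.csv with the frozen flag set.
scratch=$(mktemp -d)
mkdir -p "$scratch/db/db_example"
csv="$scratch/db/db_example/bucket_info.csv"
printf '%s\n' '"indextime_et","indextime_lt","frozen_in_cluster"' \
              '1484368345,1484385061,1' > "$csv"

# Same edit as in the steps: a trailing 1 on any line becomes 0.
grep -Rle '1$' --include bucket_info.csv "$scratch" | xargs sed -i 's/1$/0/g'

# Show the result: the header is unchanged and only the last field flips.
cat "$csv"
```

The second timestamp in the sample also ends in 1, but it is not at the end of the line, so the `1$` pattern leaves it alone.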
