
SmartStore: How do you upload buckets manually to the remote store after indexer migration is complete?

rbal_splunk
Splunk Employee

We have a few different requirements:
i) Upload multiple terabytes of legacy standalone buckets to an index that has already been migrated to the remote store.
ii) Upload a few legacy standalone buckets to an index after it has already been migrated.


rbal_splunk
Splunk Employee

Before answering this question we need to understand cachemanager_upload.json.

This file resides at $SPLUNK_HOME/var/run/splunk/cachemanager_upload.json and is used to migrate buckets to SmartStore.

The sample below shows the list of buckets to be uploaded to the remote store.

cat ./var/run/splunk/cachemanager_upload.json | sed 's/,/\n/g'
{"bucket_ids":["bid|_audit~100~761A77A2-6676-4BF9-83CD-1CB243ED61BF|"
"bid|_audit~103~EDEAC3E5-E0B3-45B9-84B3-A1E087035148|"
"bid|_audit~104~761A77A2-6676-4BF9-83CD-1CB243ED61BF|"
"bid|_audit~108~EDEAC3E5-E0B3-45B9-84B3-A1E087035148|"
"bid|_audit~110~761A77A2-6676-4BF9-83CD-1CB243ED61BF|"
"bid|_audit~124~761A77A2-6676-4BF9-83CD-1CB243ED61BF|"

As part of the migration, the DatabaseDirectoryManager (DDM) pre-registers the buckets with the cache manager via bulk register, i.e. it writes the buckets to cachemanager_upload.json at "$SPLUNK_HOME/var/run/splunk/". This file maintains the list of buckets that need to be uploaded to remote storage. Whenever a bucket needs to be uploaded, its bucket id (bid) is flushed to this file, so that if Splunk crashes or is restarted before the upload completes, the upload process resumes from where it left off. After the upload is done, the entry is removed from the file so that the in-memory state stays in sync with the state on disk.
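For example, while the migration is running you can get a rough count of how many buckets are still pending upload. This is just a minimal sketch, assuming the default file path shown above and a POSIX shell with grep available:

grep -o 'bid|[^"]*|' $SPLUNK_HOME/var/run/splunk/cachemanager_upload.json | wc -l

The count should shrink toward zero as the cache manager drains the pending uploads.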

This file can also be updated manually to upload buckets to the remote store using bulk_register: hit the cacheman endpoint for a bucket, requesting that the bucket be added to cachemanager_upload.json.

curl -k -u <user>:<passwd> -X POST https://<uri>/services/admin/cacheman/_bulk_register -d cache_id="<cacheId>"

example:

 curl -k -u admin:changeme -X POST https://localhost:10041/services/admin/cacheman/_bulk_register -d cache_id="bid|taktaklog~1~F7770FEB-F5A6-4846-A0BB-DDC05126BBF6|"

Here is an example of uploading multiple buckets:

curl --netrc-file nnfo -k -X POST https://localhost:8089/services/admin/cacheman/_bulk_register  -d cache_id="bid|nccihub~17~024011E7-E61E-45CE-82DE-732038D5C276|" -d cache_id="bid|nccihub~22~024011E7-E61E-45CE-82DE-732038D5C276|" -d cache_id="bid|nccihub~28~024011E7-E61E-45CE-82DE-732038D5C276|" -d cache_id="bid|nccihub~34~024011E7-E61E-45CE-82DE-732038D5C276|" -d cache_id="bid|nccihub~12~E646664A-D351-41E4-BBE7-5B02A08C44C9|" -d cache_id="bid|nccihub~17~F876C294-3E3E-488A-8344-16727AC34C52|" -d cache_id="bid|nccihub~17~E646664A-D351-41E4-BBE7-5B02A08C44C9|"

When the bulk register REST endpoint is called, it adds the buckets to cachemanager_upload.json, and a subsequent restart of the indexer uploads them to the remote store.
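If you have a long list of buckets, building that curl command by hand gets tedious. Here is a small bash sketch, assuming a hypothetical file bucket_ids.txt containing one bucket id per line (in the bid|<index>~<id>~<guid>| form shown above) and the default management port and credentials from the earlier examples:

#!/usr/bin/env bash
# bucket_ids.txt is a hypothetical input file: one bucket id per line.
args=()
while IFS= read -r bid; do
  [ -n "$bid" ] && args+=(-d "cache_id=$bid")
done < bucket_ids.txt

# Register all of the buckets in a single _bulk_register call.
curl -k -u admin:changeme -X POST \
  "https://localhost:8089/services/admin/cacheman/_bulk_register" "${args[@]}"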

You may hit a scenario where the customer is planning to bring multiple TB of legacy standalone buckets into a SmartStore cluster deployment.

When the indexer is first enabled for SmartStore, its buckets are migrated to the remote store. When the migration is finished, Splunk creates $SPLUNK_HOME/var/lib/splunk//db/.buckets_synced_to_remote_storage, and any bucket added afterwards does not get uploaded.

For the requirement to add "multiple TB of legacy standalone buckets to a SmartStore cluster deployment", we should use migration and not bulk_register. In this case, you can remove $SPLUNK_HOME/var/lib/splunk//db/.buckets_synced_to_remote_storage, restart the indexer, and the buckets will be re-uploaded.
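A minimal sketch of that re-migration step, where <index> is a placeholder for each migrated index directory whose marker file you want to remove (not a literal path from the original post):

$SPLUNK_HOME/bin/splunk stop
rm $SPLUNK_HOME/var/lib/splunk/<index>/db/.buckets_synced_to_remote_storage
$SPLUNK_HOME/bin/splunk start

On restart the indexer treats the index as not yet migrated and uploads any buckets that are missing from the remote store.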

gjanders
SplunkTrust

While I used the approach above:

"For the requirement to add "multiple TB of legacy standalone buckets to a SmartStore cluster deployment", we should use migration and not bulk_register. In this case, you can remove $SPLUNK_HOME/var/lib/splunk//db/.buckets_synced_to_remote_storage, restart the indexer, and the buckets will be re-uploaded."


My only recommendation is that if you migrate the buckets from a single-site indexer cluster to a multi-site indexer cluster, you consider changing the (old) cluster to multi-site before the migration.
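For reference, converting the old cluster to multi-site is done in server.conf on the cluster manager/master. The stanza below is only a hedged sketch; the site names and replication/search factors are placeholders, not a recommendation for your environment:

[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:1,total:2
site_search_factor = origin:1,total:2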

I hit a strange issue months later where, as the buckets froze, the cluster manager/master node kept querying SmartStore for the buckets, triggering 404 errors.

The issue disappeared after a restart of the cluster master, but it happened for any buckets that were migrated from the single-site cluster using the above method, so I'm wondering if the fact that the buckets were not multi-site was part of the issue...

Another alternative to a restart would be (as mentioned in the support case):

curl -k -u admin:<password> -X POST "https://<cm>:<mgmt_port>/services/cluster/master/buckets/<bucket_id>/remove_all"
