
[SmartStore] Can I get more information on migrating data from Splunk local storage to remote storage?

Splunk Employee

I am trying to migrate data from local storage to the remote store and would like to understand the best way to monitor the progress.


Re: [SmartStore] Can I get more information on migrating data from Splunk local storage to remote storage?

Splunk Employee

The migration from local storage to the remote store (such as S3) starts when the cluster bundle containing the remote store configuration is deployed from the cluster master to the cluster peers. The migration itself happens on the indexers. During migration, the peers upload all searchable copies to the remote store. When multiple peers upload the same copy of a bucket, only one copy remains in the remote store.

Once the migration is complete on an indexer, it will not be attempted again (if needed, it can be triggered manually). Below are sample searches you can use to look at different aspects of the migration process.

1) Tracing the start of the migration (the splunkd.log component DatabaseDirectoryManager logs one entry per index)

SPL
index=_internal source="*splunkd.log" DatabaseDirectoryManager "Remote storage migration needed" | timechart count by idx

Sample Event:
11-21-2018 06:38:20.514 +0000 INFO DatabaseDirectoryManager - Remote storage migration needed for idx=main for a bucket count=34

This event has the index name and the count of buckets to be migrated.


2) Tracking the end of the migration (a single event covers all indexes)

SPL
index=_internal source="*splunkd.log" component=CacheManager "Remote storage migration" completed

Sample Event:
11-21-2018 06:38:28.957 +0000 INFO CacheManager - Remote storage migration of buckets and summaries completed (duration_sec=8 upload_jobs=67)


Note: you can check that upload_jobs matches the sum of the bucket counts reported for each index.
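If you export the two kinds of events above, the comparison can be scripted. This is only a rough sketch in Python: the regular expressions are written against the sample events in this post, and the `_internal` line in `SAMPLE` is invented purely so that the totals add up. The function name `reconcile` is made up for illustration.

```python
import re

# Patterns written against the sample events shown in this thread.
START_RE = re.compile(
    r"Remote storage migration needed for idx=(\S+) for a bucket count=(\d+)"
)
# Accepts both "upload_jobs" and "uploadjobs", since the underscore may be
# lost when the event is copied out of a web page.
DONE_RE = re.compile(
    r"Remote storage migration of buckets and summaries completed "
    r"\(.*?upload_?jobs=(\d+)\)"
)


def reconcile(lines):
    """Return (bucket count per index, reported upload jobs or None)."""
    per_index = {}
    upload_jobs = None
    for line in lines:
        match = START_RE.search(line)
        if match:
            per_index[match.group(1)] = int(match.group(2))
            continue
        match = DONE_RE.search(line)
        if match:
            upload_jobs = int(match.group(1))
    return per_index, upload_jobs


SAMPLE = [
    # From this thread:
    "11-21-2018 06:38:20.514 +0000 INFO DatabaseDirectoryManager - "
    "Remote storage migration needed for idx=main for a bucket count=34",
    # Hypothetical second index, invented for the example:
    "11-21-2018 06:38:20.520 +0000 INFO DatabaseDirectoryManager - "
    "Remote storage migration needed for idx=_internal for a bucket count=33",
    # From this thread:
    "11-21-2018 06:38:28.957 +0000 INFO CacheManager - Remote storage "
    "migration of buckets and summaries completed (duration_sec=8 upload_jobs=67)",
]

if __name__ == "__main__":
    counts, jobs = reconcile(SAMPLE)
    print("expected uploads:", sum(counts.values()), "reported:", jobs)
```

A mismatch between the two numbers is a hint to look closer at that indexer, not proof of a failure.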

3) Here is an SPL search that can also be used to see the progress of the migration, but it has some limitations

| rest /services/admin/cacheman/_metrics splunk_server=<INDEXERS>
| rename migration.total_jobs AS migration_jobs_total, migration.current_job AS migration_jobs_complete
| eval migration_jobs_remaining=migration_jobs_total-migration_jobs_complete
| fillnull value="-" migration.end_epoch
| stats count by splunk_server migration.start_epoch migration.end_epoch migration.status migration_jobs_total migration_jobs_complete migration_jobs_remaining
| eval percent_complete = round((migration_jobs_complete/migration_jobs_total)*100,1)
| eval current_time_secs=now()
| eval time_elapsed_secs=if('migration.status'="finished",('migration.end_epoch'- 'migration.start_epoch'),(current_time_secs - 'migration.start_epoch'))
| eval secs_per_job=time_elapsed_secs/migration_jobs_complete
| eval time_remaining_secs=migration_jobs_remaining*secs_per_job
| eval seconds_per_job=round((secs_per_job),2)
| convert timeformat="%+" ctime(migration.start_epoch) AS migration_start_time
| convert timeformat="%+" ctime(migration.end_epoch) AS migration_end_time
| eval migration_end_time=if('migration.status'="finished",migration_end_time,"-")
| convert timeformat="%+" ctime(current_time_secs) AS current_time
| eval current_time=if('migration.status'="finished","-",current_time)
| eval time_elapsed_hours=round(time_elapsed_secs/3600,2)
| eval time_remaining_hours=round((time_remaining_secs/3600),2)
| table splunk_server migration.status migration_start_time migration_end_time current_time migration_jobs_total migration_jobs_complete migration_jobs_remaining percent_complete time_elapsed_hours time_remaining_hours seconds_per_job 

The above search is sometimes misleading. For example, if the indexer crashes or is shut down, the search can still show the migration as finished at 100%.
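To make the arithmetic in that search easier to follow, here is the same calculation as a small Python sketch. The function name `migration_estimate` and its parameters are made up for illustration; they mirror the total jobs, completed jobs, and start time that the search reads from the REST endpoint. The estimate assumes every job takes about the same time, which is a simplification.

```python
def migration_estimate(total_jobs, jobs_complete, start_epoch, now_epoch):
    """Mirror the eval steps of the search for an in-progress migration."""
    jobs_remaining = total_jobs - jobs_complete
    elapsed_secs = now_epoch - start_epoch

    # Guard against division by zero, which the SPL handles by yielding null.
    percent_complete = (
        round(jobs_complete / total_jobs * 100, 1) if total_jobs else None
    )
    if jobs_complete > 0:
        secs_per_job = elapsed_secs / jobs_complete
        remaining_hours = round(jobs_remaining * secs_per_job / 3600, 2)
    else:
        secs_per_job = None
        remaining_hours = None

    return {
        "jobs_remaining": jobs_remaining,
        "percent_complete": percent_complete,
        "secs_per_job": secs_per_job,
        "time_elapsed_hours": round(elapsed_secs / 3600, 2),
        "time_remaining_hours": remaining_hours,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 25 of 100 jobs done after 500 seconds.
    print(migration_estimate(100, 25, 0, 500))
```

Because the remaining time is extrapolated from the average so far, it will swing a lot early in the migration when only a few jobs have completed.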


4) Upload operations can be monitored:

SPL
index=_internal source=*/metrics.log TERM(group=cachemgr_upload) | timechart span=1s sum(queued) AS queued, sum(succeeded) AS succeeded by host

Sample Event:
10-25-2018 10:48:06.599 +0000 INFO Metrics - group=cachemgr_upload, elapsed_ms=17017, kb=124372, succeeded=1

5) Upload speed

SPL:
index=_audit ( action=local_bucket_upload AND ( sourcetype=audittrail )) | eval elapsed_s=elapsed_ms/1000 | eval kbps = kb/elapsed_s | eval mbps=kbps/1024 | timechart span=1s max(mbps) by host

Sample Event :

Audit:[timestamp=10-25-2018 10:47:37.615, user=n/a, action=local_bucket_upload, info=completed, cache_id="bid|_internal~40~C3912E39-C49C-4A24-B119-AA4B13C0F3F1|", local_dir="/home/splunker/splunk/var/lib/splunk/_internaldb/db/db_1540464387_1540461589_40_C3912E39-C49C-4A24-B119-AA4B13C0F3F1", kb=124372, elapsed_ms=17017][n/a]
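As a sanity check on the units, here is the same conversion as the search, written as a short Python sketch and applied to the kb and elapsed_ms values from the sample event. The function name `upload_rate_mbps` is made up for illustration.

```python
def upload_rate_mbps(kb, elapsed_ms):
    """Convert an upload of `kb` kilobytes in `elapsed_ms` ms to MB/s."""
    if elapsed_ms <= 0:
        return None  # no meaningful rate for a zero-length upload
    elapsed_s = elapsed_ms / 1000
    kbps = kb / elapsed_s
    return kbps / 1024


if __name__ == "__main__":
    # Values taken from the sample audit event in this thread.
    rate = upload_rate_mbps(kb=124372, elapsed_ms=17017)
    print(f"{rate:.2f} MB/s")
```

For the sample event this works out to roughly 7 MB/s for that single bucket upload.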

6) Role of the .buckets_synced_to_remote_storage file in migration:

find . -type f -name .buckets_synced_to_remote_storage
./var/lib/splunk/audit/db/.buckets_synced_to_remote_storage
./var/lib/splunk/_internaldb/db/.buckets_synced_to_remote_storage
./var/lib/splunk/_introspection/db/.buckets_synced_to_remote_storage
./var/lib/splunk/_telemetry/db/.buckets_synced_to_remote_storage
./var/lib/splunk/fishbucket/db/.buckets_synced_to_remote_storage
./var/lib/splunk/historydb/db/.buckets_synced_to_remote_storage
./var/lib/splunk/defaultdb/db/.buckets_synced_to_remote_storage
./var/lib/splunk/summarydb/db/.buckets_synced_to_remote_storage

At start-up, if an index is S2-enabled (SmartStore-enabled), we check whether its buckets need to be uploaded. To do this, we look for the file $homePath/.buckets_synced_to_remote_storage. The presence of this file indicates that we don't need to upload files to the remote storage, and therefore no migration needs to happen.
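If you want the same information as the find command but grouped per index, something like the following Python sketch could work. It assumes every index keeps its homePath at `<index directory>/db` under one parent directory, as in the find output above; indexes with a custom homePath would not be picked up. The function name `marker_status` is made up for illustration.

```python
from pathlib import Path

MARKER = ".buckets_synced_to_remote_storage"


def marker_status(splunk_db):
    """Map each index directory name to True if <dir>/db/MARKER exists.

    Assumes the default layout where homePath is <splunk_db>/<index dir>/db.
    """
    status = {}
    for db_dir in sorted(Path(splunk_db).glob("*/db")):
        if db_dir.is_dir():
            status[db_dir.parent.name] = (db_dir / MARKER).is_file()
    return status


if __name__ == "__main__":
    # Path relative to the Splunk install directory, as in the find output.
    for index_dir, synced in marker_status("var/lib/splunk").items():
        print(f"{index_dir}: {'marker present' if synced else 'marker missing'}")
```

An index directory reported as "marker missing" is one where, per the explanation above, the start-up check would still consider an upload necessary.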

7) Here is another search to confirm that migration has completed on the indexers.

./splunk search "|rest /services/admin/cacheman |search cm:bucket.stable=0 |stats count" # should return zero


Re: [SmartStore] Can I get more information on migrating data from Splunk local storage to remote storage?

SplunkTrust

Can you correct the typo for "understane" please?
Also perhaps you can accept your own answers so they are marked as closed?

I appreciate the question/answer format, hopefully some of these queries are fed back into the monitoring console...
