I recently set up SmartStore with a test index and am sending data to S3. It's working perfectly, but I have questions about warm-to-frozen rolling and archiving.
The following Splunk doc says hot buckets roll to warm and get uploaded to S3, which is great, but it doesn't say whether they can or can't be held there indefinitely. It then says, "Buckets roll to frozen directly from warm.", but nothing more about it. If buckets can go to S3 and never roll to frozen, that's okay, but if they roll to frozen and get deleted without my having configured anything to cause that, that's something I need to avoid.
https://docs.splunk.com/Documentation/Splunk/7.3.2/Indexer/SmartStoreindexing#Bucket_states_and_SmartStore
After buckets roll to warm and land in the S3 bucket, if no freezing settings are configured, will Splunk automatically roll the buckets to frozen after a while?
If the warm buckets in the S3 bucket never get rolled to frozen and no archiving is set up, will the data in S3 remain as warm buckets forever, and does that cause any issues besides slow searches over long time ranges?
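For reference, these are the indexes.conf settings I believe govern freezing on a SmartStore index (values shown are the documented defaults; the stanza and volume names are from my test setup):

[smartstore_test]
remotePath = volume:remote_store/$_index_name
# Time-based retention before freezing; default is 188697600 seconds (~6 years)
frozenTimePeriodInSecs = 188697600
# SmartStore-specific cap on total index size across the cluster;
# 0 (the default) means no size-based freezing
maxGlobalDataSizeMB = 0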
The Splunk docs say the coldToFrozenScript can be used, and I've tried setting it up so that when warm buckets in S3 get rolled to frozen, the script takes the colddb buckets and archives them to another S3 bucket. But because buckets never roll to cold on the local server, nothing gets archived. Getting the script to work with Splunk, S3, and the cache manager doesn't seem to pan out. Is there a process or script for archiving SmartStore warm buckets to another S3 bucket without having to archive locally on the indexer?
Update 10/17/2019
I've made some changes to the script and tested a few things out. I switched from Python 3 to Splunk's internal Python 2.7 and adjusted my code to ensure that Splunk runs the script with its own binaries. Splunk states the coldToFrozenScript can be used for SmartStore indexes, but in what capacity, I'm not sure.
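The relevant indexes.conf stanza now looks like this (the coldToFrozenScript paths match the errors further down; the volume name is a placeholder):

[smartstore_test]
remotePath = volume:remote_store/$_index_name
coldToFrozenScript = "/opt/splunk/bin/python2.7" "/opt/splunk/etc/slave-apps/ColdToFrozenS3/bin/coldToFrozenS3.py"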
If I run the script manually on a bucket that exists locally on the server (not in S3), it runs just fine and the bucket gets copied to my S3 endpoint for archiving. I do this by running "/opt/splunk/bin/python2.7 /opt/splunk/etc/slave-apps/ColdToFrozenS3/bin/coldToFrozenS3.py <bucket_path>". It does exactly what I want with local buckets, but I'm trying to get it to run when SmartStore moves warm S3 buckets to frozen.
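A stripped-down sketch of what the script does (not the full script; the archive bucket name is a placeholder, and it assumes the AWS CLI is installed on the indexer with role access to the bucket):

# coldToFrozenS3.py - minimal sketch of the archiving logic
import os
import subprocess
import sys

ARCHIVE_BUCKET = "s3://archive-test-bucket/frozen"  # placeholder destination

def main():
    if len(sys.argv) != 2:
        sys.exit("usage: coldToFrozenS3.py <bucket_dir>")
    # Splunk passes the bucket directory as the only argument
    bucket_dir = sys.argv[1].rstrip("/")
    if not os.path.isdir(bucket_dir):
        sys.exit("not a directory: %s" % bucket_dir)
    # Copy the whole bucket directory to the archive bucket, keyed by name
    dest = "%s/%s" % (ARCHIVE_BUCKET, os.path.basename(bucket_dir))
    rc = subprocess.call(["aws", "s3", "cp", "--recursive", bucket_dir, dest])
    # Exit 0 tells Splunk the freeze succeeded (it then removes the local
    # copy); non-zero tells Splunk the freeze failed
    sys.exit(rc)

if __name__ == "__main__":
    main()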
So I deployed the script back to the indexers, and when Splunk tries to freeze a SmartStore bucket using the coldToFrozenScript, I get the errors below. It looks like Splunk is trying to retrieve the bucket through the cache manager and failing for some odd reason, or it is invoking the script before the bucket has been pulled down from S3, or something else I'm not aware of.
One of the entries shows a 404 error, which doesn't make sense: the servers are able to read and write to both S3 buckets, and just for testing purposes I've given their roles full access. Manually downloading and uploading from each indexer to each S3 bucket works fine, so I'm not sure why the 404 is occurring.
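The manual access check from each indexer looked roughly like this (bucket names are placeholders for my SmartStore and archive buckets):

aws s3 cp s3://smartstore-test-bucket/smoke-test.txt .   # download from the SmartStore bucket
aws s3 cp smoke-test.txt s3://archive-test-bucket/       # upload to the archive bucket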
10-17-2019 10:42:33.859 -0400 ERROR DatabaseDirectoryManager - failed to open bucket/wait for bucket to be local through CacheManager, cid="bid|smartstore_test~14~044B61B4-FEA8-4CC9-BEA9-C694C082BECA|", exception=localize operation failed for cacheId="bid|smartstore_test~14~044B61B4-FEA8-4CC9-BEA9-C694C082BECA|"

10-17-2019 10:42:33.928 -0400 ERROR RetryableClientTransaction - transactionDone(): transactionId=0x7f912e037000 rTxnId=0x7f90c6ffe0f0 success=N HTTP-statusCode=404 HTTP-statusDescription=Not Found retry=N no_retry_reason="transaction had fatal error"

10-17-2019 10:42:33.928 -0400 WARN BucketMover - RemoteStorageAsyncFreezer freeze failed for bid=smartstore_test~14~044B61B4-FEA8-4CC9-BEA9-C694C082BECA since coldToFrozenScript="/opt/splunk/bin/python2.7" "/opt/splunk/etc/slave-apps/ColdToFrozenS3/bin/coldToFrozenS3.py" could not be run due to exception=std::exception

(All three events are from host = <indexer.hostname.local>, source = /opt/splunk/var/log/splunk/splunkd.log, sourcetype = splunkd.)