I need some help from Splunk regarding a request from one of our clients. The client is migrating from Splunk to Sentinel but still has about 25 TB of data on Splunk Cloud, which they want to keep for at least a year. The data should remain readable for investigations and compliance purposes.
I know the client might need Splunk Professional Services for whichever option is chosen, since it's Splunk Cloud, but what would be the best and most cost-effective solution for them? Can you please advise on the best way forward?
If you enable Dynamic Data Self-Storage (DDSS) to export your aged ingested data, the oldest data is moved to an Amazon S3 bucket in the same region as your Splunk Cloud deployment before it is deleted from the index. You are responsible for the AWS charges for the S3 account. Once data is deleted from the index, it is no longer searchable in Splunk Cloud.
Customers are responsible for managing DDSS and a non-Splunk Cloud stack for searching the archived data. This is a manual process, and customers will require a Professional Services engagement.
https://docs.splunk.com/Documentation/SplunkCloud/latest/Admin/DataSelfStorage
NOTE:
DDSS data egress: no limit; export rate is 1 TB/hr; the self-storage bucket must be in the same region as the indexing tier. (At that rate, the ~25 TB in question would take roughly 25 hours to export.)
https://www.splunk.com/en_us/blog/platform/dynamic-data-data-retention-options-in-splunk-cloud.html
Professional Services is not required to configure or use DDSS.
If you are moving away from Splunk Cloud, a customer can set up DDSS using an sc_admin account via the Web UI and/or the Admin Config Service (ACS) API, and then configure their indexes to use the DDSS location.
To migrate away from Splunk Cloud, the customer then needs to reduce the retention on these indexes, which triggers existing buckets to roll from DDAS (Dynamic Data Active Searchable) to "frozen" (DDSS).
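For illustration, here is a minimal Python sketch of both steps via ACS, assuming the index-management endpoint and the selfStorageBucketPath/searchableDays field names; the stack name, index name, token, and S3 path are all placeholders, so verify the details against the ACS docs for your stack:

```python
import requests

# Placeholder values -- replace with your stack, index, and an sc_admin ACS token.
STACK = "example-stack"
INDEX = "firewall_logs"
TOKEN = "eyJ..."

url = f"https://admin.splunk.com/{STACK}/adminconfig/v2/indexes/{INDEX}"
headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Point the index at the DDSS location, then shrink searchable retention so
# existing DDAS buckets roll to frozen and land in the customer's S3 bucket.
payload = {
    "selfStorageBucketPath": "s3://customer-ddss-archive/splunk",  # assumed field name
    "searchableDays": 1,
}

resp = requests.patch(url, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
print(resp.status_code)
```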
At this point the buckets in S3 are the same as any other frozen bucket from Splunk Enterprise or Splunk Cloud and can be thawed (see https://docs.splunk.com/Documentation/Splunk/9.4.2/Indexer/Restorearchiveddata).
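As a rough sketch of that thaw process in Python (the paths, index name, and bucket directory below are made-up examples; DDSS uploads may also include extra metadata objects alongside the bucket, which you would skip):

```python
import shutil
import subprocess
from pathlib import Path

# Placeholder paths -- adjust to your environment.
SPLUNK_HOME = Path("/opt/splunk")
INDEX = "firewall_logs"
frozen_bucket = Path("/data/ddss/firewall_logs/db_1700000000_1690000000_42")
thaweddb = Path(f"/opt/splunk/var/lib/splunk/{INDEX}/thaweddb")

# 1. Copy the frozen bucket into the index's thaweddb directory.
dest = thaweddb / frozen_bucket.name
shutil.copytree(frozen_bucket, dest)

# 2. Rebuild the index and tsidx files from the raw journal.
subprocess.run([str(SPLUNK_HOME / "bin" / "splunk"), "rebuild", str(dest)], check=True)

# 3. Restart splunkd so the thawed bucket becomes searchable.
subprocess.run([str(SPLUNK_HOME / "bin" / "splunk"), "restart"], check=True)
```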
If only the raw data is required, it can be extracted from the journal file (rawdata/journal.gz) inside each bucket.
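One way to do that extraction is with the bundled exporttool command; a hedged sketch, with placeholder paths and assuming exporttool's CSV output mode:

```python
import subprocess
from pathlib import Path

# Placeholder paths -- point these at a downloaded DDSS bucket.
SPLUNK_BIN = "/opt/splunk/bin/splunk"
bucket = Path("/data/ddss/firewall_logs/db_1700000000_1690000000_42")
export_csv = Path("/data/export/firewall_logs.csv")
export_csv.parent.mkdir(parents=True, exist_ok=True)

# exporttool reads the bucket's rawdata/journal.gz and writes the events as CSV.
subprocess.run(
    [SPLUNK_BIN, "cmd", "exporttool", str(bucket), str(export_csv), "-csv"],
    check=True,
)
```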
According to the Splunk Cloud Overview Technical Enablement, Splunk recommends engaging Professional Services. @livehybrid
I'd challenge that this is either incorrect or missing some context. I appreciate that this is the sort of thing PS gets involved with, but I know a number of customers who have managed this themselves; once the data is in DDSS it isn't much different from a standard thaw process.
In fact, the process is detailed in the public docs (Restore indexed data from a self-storage location) with a step-by-step procedure that does not mention any requirement for PS.
I created a script to convert DDSS to SmartStore buckets for a customer who wanted a small on-prem search head to be able to access old data, which you might find useful: https://github.com/livehybrid/ddss-restore
You're absolutely right that the public documentation (including the Restore indexed data from a self-storage location guide) outlines the DDSS process in detail, and it is technically possible for customers to manage this independently, especially those with in-house Splunk expertise.
Hi @ssuluguri
You mention that the customer wants their data to be readable after moving off Splunk Cloud. Does this mean it would need to be in raw format?
The easiest way to get data out of Splunk Cloud, in my experience, is to use Dynamic Data: Self-Storage (DDSS), storing frozen buckets in the customer's S3 bucket (see the sketch after the list below for a quick way to inventory what has landed there). Once it's there you can do a number of things with it:
1) Thaw it out into a Splunk instance with a minimal/free license (you won't be ingesting new data).
2) Extract the journal file from the DDSS buckets, leaving you with the raw data.
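As mentioned above, a quick way to see what DDSS has archived is to list the bucket folders in S3. A minimal Python sketch with boto3 (the bucket name and prefix are placeholders for wherever your DDSS location points):

```python
import boto3

# Placeholder location -- the S3 bucket/prefix you configured for DDSS.
BUCKET = "customer-ddss-archive"
PREFIX = "splunk/firewall_logs/"

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

# Frozen buckets land as folders of objects; list the top-level folder
# names under the prefix to see which buckets have been archived.
seen = set()
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX, Delimiter="/"):
    for cp in page.get("CommonPrefixes", []):
        seen.add(cp["Prefix"])

for prefix in sorted(seen):
    print(prefix)
```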
Would the customer be willing to have a small Splunk instance with their archived data in for easy searching?
If it helps, I've got a repo at https://github.com/livehybrid/ddss-restore which is primarily for converting DDSS buckets back into SmartStore buckets for use as a semi-offline (in-case-of-emergencies-style) data store.
Thank you for your advice.