I'm trying to set up some new indexes the best-practices way, and I'm a little confused about how to accomplish what I want.
I used to just make an index, define maxTotalDataSizeMB (for the entire index), and move on. I never really cared much before, but now I understand that may not have been the best approach. I don't know that I ever truly controlled how much data was in the hot/warm buckets vs cold (unsearchable). So how would I do this better now? I'm running Splunk Enterprise v7.2.4.
I want a 5GB searchable index (so hot/warm), but another 10GB set aside for cold which will be on another mount (spinning disc) and no frozen at all. Do I still have to define a thaweddb too? I was hoping to review my "original" vs "new" methods.
Am I missing anything??
Original:
homePath = $SPLUNK_DB/$_index_name/db
coldPath = $SPLUNK_DB/$_index_name/colddb
thawedPath = $SPLUNK_DB/$_index_name/thaweddb
maxTotalDataSizeMB = 15000
repFactor = auto

New:
homePath = /opt/utils/splunk/new/db
homePath.maxDataSizeMB = 5000
coldPath = /opt/slowdisc/splunk/new/colddb
coldPath.maxDataSizeMB = 10000
thawedPath = /opt/utils/splunk/new/thaweddb
repFactor = auto
Cold data is still searchable.
The idea for cold is that you can put it on slower (cheaper) storage because your users are less likely to need to search it. However, it's still fully searchable (albeit slower than hot/warm).
Data is only unsearchable once it's 'frozen'. If you don't do anything to manage your frozen data, it will simply be deleted when it hits your storage/date limits, but setting a frozen path means you can manage archiving that data any way you see fit. You can:
a.) leave it in the frozen folder
b.) rsync it somewhere else, and then delete it
c.) back it up to tape (ha!) and then delete it
d.) use a frozen script to do something else with it (like move it to S3)
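As a rough sketch of options (a) and (d), the archive destination is set per index in indexes.conf using coldToFrozenDir or coldToFrozenScript. The stanza name, the frozen directory, and the script name below are placeholders for illustration, not values from the question, and normally only one of the two settings is used.

```ini
[new]
# Option (a): copy buckets here instead of deleting them when they freeze.
# The directory is an assumed example path.
coldToFrozenDir = /opt/slowdisc/splunk/new/frozendb

# Option (d): alternatively, hand each frozen bucket to a script.
# The script name is hypothetical; keep this commented out if coldToFrozenDir is set.
# coldToFrozenScript = "$SPLUNK_HOME/bin/python" "$SPLUNK_HOME/bin/myArchiveScript.py"
```

Options (b) and (c) would then operate on whatever lands in the frozen directory, outside of Splunk.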
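Assuming you manage your frozen data somehow, the thawed directory only comes into play if you ever need to re-import it. Note that thawedPath itself still has to be defined for every index in indexes.conf, even if the directory stays empty.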
Thawed is for data that was previously frozen: simply copy your old frozen data there and ask Splunk to thaw it to make it searchable again.
Your high-level approach looks sound, but I recommend reading the index documentation fully to understand all the options.
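For reference, the sizing discussed in this thread could be written as a single stanza along these lines. This is only a sketch: the stanza name `new` is inferred from the paths in the question, and the explicit maxTotalDataSizeMB line is an addition so that the overall cap matches the two per-path caps instead of relying on the default.

```ini
[new]
# Hot/warm buckets on the faster disk, capped at roughly 5 GB.
homePath = /opt/utils/splunk/new/db
homePath.maxDataSizeMB = 5000

# Cold buckets on the slower mount, capped at roughly 10 GB (still searchable).
coldPath = /opt/slowdisc/splunk/new/colddb
coldPath.maxDataSizeMB = 10000

# thawedPath has to be defined even if nothing is ever thawed.
thawedPath = /opt/utils/splunk/new/thaweddb

# Overall cap across hot, warm, and cold; added here as an assumption.
maxTotalDataSizeMB = 15000

repFactor = auto
```

With neither coldToFrozenDir nor coldToFrozenScript set, buckets that roll out of cold are deleted, which matches the "no frozen at all" requirement. Buckets also freeze by age (frozenTimePeriodInSecs), so whichever limit is reached first applies.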
Thank you for the additional info! Always good to have a clear understanding of things. I never thought of moving frozen data over to the cloud (S3), which may be an interesting solution down the road. I am definitely going to manage the hot/warm and cold sizes in the index better now, so I know how much I'm holding and where.
Appreciate the info.