So here is my understanding and the way I've got our on-prem instance configured.

Hot buckets are stored on a local flash array. When a bucket closes, Splunk keeps the closed bucket on the flash array and uploads a copy to the S3 storage. The S3 copy is considered the 'master copy'. I try not to use the term 'warm bucket', and instead say 'cached bucket'. All searches are performed on either hot or cached buckets on the local flash array. Cached buckets are eligible for eviction from local storage by the cache manager. So if your search needs a bucket that is not on local storage, the cache manager evicts eligible cached buckets, retrieves the needed buckets from S3 storage, and then the search runs.

frozenTimePeriodInSecs defines our overall retention time. We use hotlist_recency_secs to control when a cached bucket becomes eligible for eviction; that is, buckets younger than hotlist_recency_secs are not eligible for eviction. Our statistics show that probably 90% of our queries have a time span of 7 days or less (research gosplunk.com for a query). Thus, by setting hotlist_recency_secs to 14 days, we are assured that the buckets those searches need are on local, searchable storage without having to reach out to the S3 storage (which is slower).

One last thing: we need 1 year of searchable retention, but we also need to keep 30 months of total retention. To accomplish this, I use ingest actions to the S3 storage. Ingest actions will write the events in compressed JSON format, partitioned by year, month, day, and sourcetype. Hope this helps.
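For anyone wanting to see what that looks like on disk, here is a minimal sketch of the relevant stanzas. The volume name, S3 bucket path, and index name are placeholders, and the numeric values just encode the 14-day eviction window and 1-year searchable retention described above; adjust to your own environment.

```ini
# server.conf -- cache manager default (value is illustrative)
[cachemanager]
# Cached buckets younger than this are protected from eviction.
# 14 days = 14 * 86400 = 1209600 seconds.
hotlist_recency_secs = 1209600

# indexes.conf -- hypothetical SmartStore volume and index
[volume:remote_store]
storageType = remote
path = s3://my-smartstore-bucket/indexes

[my_index]
homePath = $SPLUNK_DB/my_index/db
remotePath = volume:remote_store/$_index_name
# 1 year searchable retention: 365 * 86400 = 31536000 seconds.
frozenTimePeriodInSecs = 31536000
# Optional per-index override of the eviction-protection window.
hotlist_recency_secs = 1209600
```

The 30-month copy written by ingest actions lives outside these stanzas entirely; it is a separate destination configured in the ingest actions ruleset, not part of the SmartStore cache.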