If we are using SmartStore on AWS for all our Splunk data, and we set the recency/no-evict window to some number (let's say a week), and then we turn around and search for data that is a year old, does that data get brought back on-prem for searching, or does the search actually take place in the AWS S3 buckets directly? I would assume that if it's the former, we'd need a decent-sized buffer of local storage for when it pulls that data back.
Yes, we pull the data back from S3 into cache storage in order to serve the search. Sizing the cache for your regular search load is important, as is minimizing searches that span a large time range against the raw events.
Running something like an All Time search can be slow, because we need to pull all the buckets from S3, search that data, and then evict those recently downloaded buckets (as well as other recent buckets) to make room for more as the search goes on.
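To make the eviction effect above concrete, here is a toy simulation rather than anything Splunk ships. It assumes a plain least-recently-used cache whose capacity is counted in buckets, one bucket per day, and a search that walks buckets from newest to oldest. The class and function names are made up for illustration, and the real cache manager has extra logic (such as the recency settings) that this sketch ignores.

```python
from collections import OrderedDict


class BucketCache:
    """Toy LRU cache of bucket IDs; capacity is a bucket count (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buckets = OrderedDict()  # least recently used entry comes first
        self.downloads = 0
        self.evictions = 0

    def fetch(self, bucket_id):
        """Return True on a cache hit, False if the bucket had to be downloaded."""
        if bucket_id in self.buckets:
            self.buckets.move_to_end(bucket_id)  # mark as most recently used
            return True
        self.downloads += 1
        if len(self.buckets) >= self.capacity:
            self.buckets.popitem(last=False)  # evict the least recently used
            self.evictions += 1
        self.buckets[bucket_id] = None
        return False


def run_search(cache, bucket_ids):
    """Touch every bucket the search needs and return the number of cache hits."""
    return sum(cache.fetch(b) for b in bucket_ids)


cache = BucketCache(capacity=10)
recent = list(range(364, 357, -1))   # the last 7 daily buckets, newest first
all_time = list(range(364, -1, -1))  # a year of daily buckets, newest first

run_search(cache, recent)                       # warm-up: 7 downloads
hits_before = run_search(cache, recent)         # all 7 served from cache
run_search(cache, all_time)                     # churns through the whole year
hits_after = run_search(cache, recent)          # recent buckets were pushed out

print(hits_before, hits_after, cache.downloads, cache.evictions)
```

In this toy model the one All Time search leaves only the oldest buckets in the cache, so the next "last week" search has to download everything again. That is the reason to size the cache for the regular search load and keep wide-range searches rare.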
When you search for data outside the hotlist_recency_secs period, the indexer fetches the required warm/cold buckets from remote storage and stores them in the local cache, so SSDs are recommended for local storage for better cache management and performance.
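As a rough illustration of the recency window described above, the sketch below splits the buckets a search needs into those that fall inside the window and those that fall outside it. It assumes the window is measured from the newest event in each bucket, and the function name and bucket tuples are hypothetical. Older buckets can of course still be in the cache if there has been room for them, so treat the second list as "may need a fetch", not "will".

```python
DAY = 24 * 3600
WEEK = 7 * DAY


def split_by_recency(buckets, now, hotlist_recency_secs):
    """Split (bucket_id, latest_event_epoch) pairs by the recency window.

    Returns (inside_window, outside_window) as two lists of bucket IDs.
    """
    inside_window, outside_window = [], []
    for bucket_id, latest_event_epoch in buckets:
        age = now - latest_event_epoch
        if age < hotlist_recency_secs:
            inside_window.append(bucket_id)   # favored for staying in cache
        else:
            outside_window.append(bucket_id)  # may need a fetch from remote storage
    return inside_window, outside_window


now = 1_700_000_000  # arbitrary reference time for the example
buckets = [
    ("b_recent", now - 2 * DAY),
    ("b_month", now - 30 * DAY),
    ("b_year", now - 365 * DAY),
]

inside, outside = split_by_recency(buckets, now, hotlist_recency_secs=WEEK)
print(inside, outside)
```

With a one-week window, only the two-day-old bucket falls inside it. The month-old and year-old buckets are the ones a search may have to wait on while they are copied into the local cache.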