AWS SmartStore - how is data searched in S3 vs. local?

jtm7x2
Explorer

If we are using AWS SmartStore for all our Splunk data, and we set the recency/no-evict period to some value (let's say a week), and then run a search for data that is a year old, does that data get brought back on-prem for searching, or does the search actually take place directly in the AWS S3 buckets? I would assume that if it's the former, we'd need a decent-sized buffer of local storage for when it pulls that data back.

dxu_splunk
Splunk Employee

Yes, we pull data back from S3 into cache storage in order to serve the search. Sizing the cache for your regular search load is important, as is minimizing searches that span a large time range against the raw events.

Running something like an all-time search can be slow, as we need to pull all the buckets from S3, search against this data, and then remove these recently downloaded buckets (as well as other recent buckets) to make room for more as the search goes on.
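
To illustrate why an all-time search churns the cache, here is a rough toy model in Python. It is not Splunk's cache manager: the `BucketCache` class, the capacity of 10 buckets, the one-bucket-per-day layout, and the plain LRU eviction are all assumptions made up for the illustration.

```python
from collections import OrderedDict


class BucketCache:
    """Toy LRU cache of bucket ids (hypothetical, not Splunk code).

    Capacity is counted in buckets rather than bytes to keep the model small.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.buckets = OrderedDict()  # bucket_id -> True, oldest use first
        self.downloads = 0            # fetches from remote storage
        self.evictions = 0            # buckets dropped from the local cache

    def fetch(self, bucket_id):
        if bucket_id in self.buckets:
            # Cache hit: mark as most recently used, no download needed.
            self.buckets.move_to_end(bucket_id)
            return
        # Cache miss: the bucket has to be pulled from remote storage.
        self.downloads += 1
        self.buckets[bucket_id] = True
        if len(self.buckets) > self.capacity:
            # Make room by evicting the least recently used bucket.
            self.buckets.popitem(last=False)
            self.evictions += 1


def run_search(cache, bucket_ids):
    """A search touches every bucket in its time range."""
    for bucket_id in bucket_ids:
        cache.fetch(bucket_id)


cache = BucketCache(capacity=10)
last_week = list(range(358, 365))   # assume one bucket per day, 7 recent days

run_search(cache, last_week)        # first run downloads the 7 recent buckets
run_search(cache, last_week)        # second run is served entirely from cache
downloads_after_recent = cache.downloads

run_search(cache, range(365))       # "all-time" search over a year of buckets
downloads_after_alltime = cache.downloads
evictions_after_alltime = cache.evictions
```

In this model the repeated last-week search costs no extra downloads, while the all-time search downloads every bucket in the year, including the recent ones it pushed out of the cache along the way. That is the behavior to keep in mind when sizing the cache and limiting wide time ranges.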

saiganesh49
Explorer

When you search for data outside the hotlist_recency_secs period, the indexer fetches the required warm/cold buckets from remote storage and stores them in the local cache, so SSDs are recommended for your local storage for better cache management and performance.
