Getting Data In

Does it make sense to use SmartStore for all data except hot and warm?

Motivator

We are considering SmartStore. Does it make sense to use it for all data except hot and warm data? Even if we eventually move all data to SmartStore, starting with the older data seems like a good first step.

There is also a note in the "Archive indexed data" documentation topic that says:

-- Although SmartStore indexes do not usually contain cold buckets, you still use the attributes described here (coldToFrozenDir and coldToFrozenScript) to archive SmartStore buckets as they roll directly from warm to frozen. See Configure data retention for SmartStore indexes.

What does it mean?

Re: Does it make sense to use SmartStore for all data except hot and warm?

Esteemed Legend

The bucket life-cycle changes completely with SmartStore. Probably the best topic online anywhere about those nuances is here (read both answers and all of the comments):
https://answers.splunk.com/answers/739051/smartstore-behaviors.html

Re: Does it make sense to use SmartStore for all data except hot and warm?

Motivator

So, @davidpaper says -

-- RF/SF only apply to Hot buckets. Once a bucket is rolled, it is uploaded to S3 and any bucket replicates are marked for eviction.

The way I read it, within the SmartStore paradigm, hot buckets are created by Splunk in the conventional way, and once they roll they go to S3, where there are no warm/cold/frozen boundaries. Does that make sense?

Re: Does it make sense to use SmartStore for all data except hot and warm?

Esteemed Legend

That is how I see it: hot buckets do not change at all, and warm buckets move to SmartStore. Cold is used only for very tiny metadata and perhaps for the temporary local cache (I am not sure where that actually lives), but otherwise it is a completely dead concept.

Re: Does it make sense to use SmartStore for all data except hot and warm?

Splunk Employee

Yep. I changed the way I think of the bucket life cycle with SmartStore.

Hot buckets are read/write and replicated, just as without SmartStore. Once they roll to read-only, they aren't "warm" or "cold" to me anymore; they are just read-only as they are copied to the remote object store. Once a bucket exists on the remote object store, we only download bits and pieces, not necessarily the whole bucket, when it is time to search it. Once in the remote object store, there are no warm, cold, or thawed boundaries anymore. Freezing of buckets still exists, and by default it deletes them.
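
To make that life cycle concrete, here is a toy Python model of it. This is only a sketch and not Splunk code: the `Bucket` class, the `State` names, and the `cached_locally` flag are all hypothetical, and the model treats the cache as all-or-nothing even though the real cache manager may download only parts of a bucket.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    HOT = "hot"          # read/write on local disk, replicated per RF/SF
    REMOTE = "remote"    # read-only; the bucket now exists in the object store
    FROZEN = "frozen"    # deleted by default, or archived if configured


@dataclass
class Bucket:
    name: str                    # hypothetical bucket identifier
    state: State = State.HOT
    cached_locally: bool = True  # a hot bucket always lives on local disk

    def _require(self, expected: State) -> None:
        if self.state is not expected:
            raise ValueError(
                f"{self.name} is {self.state.value}, expected {expected.value}"
            )

    def roll(self) -> None:
        """Hot -> read-only: upload to the remote store; local copy stays cached."""
        self._require(State.HOT)
        self.state = State.REMOTE

    def evict(self) -> None:
        """Drop the local cached copy; the remote copy is untouched."""
        self._require(State.REMOTE)
        self.cached_locally = False

    def fetch(self) -> None:
        """A search needs the bucket again, so bring it back into the cache."""
        self._require(State.REMOTE)
        self.cached_locally = True

    def freeze(self) -> None:
        """Retention expired: the bucket leaves both the cache and the remote store."""
        self._require(State.REMOTE)
        self.state = State.FROZEN
        self.cached_locally = False


bucket = Bucket("db_example")
bucket.roll()    # uploaded, still cached
bucket.evict()   # only the remote copy remains
bucket.fetch()   # pulled back for a search
bucket.freeze()  # gone from cache and remote store
print(bucket.state.value, bucket.cached_locally)
```

Note that there is no warm or cold state in the model at all: after `roll()`, the only thing that changes until freezing is whether a local cached copy happens to exist.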

Re: Does it make sense to use SmartStore for all data except hot and warm?

Motivator

Pretty amazing, @dpaper!

Re: Does it make sense to use SmartStore for all data except hot and warm?

Path Finder

As mentioned, the duplicate copies dictated by SF/RF apply only to hot buckets. As soon as a bucket rolls to warm, a copy is uploaded to S3, and the local copy is evicted from the cache only when it meets the eviction policy criteria.

And when we use SmartStore, warm buckets are not duplicated in the local cache, so more local storage is available to hold data.
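
A back-of-the-envelope sketch of that storage saving, in Python. The function names and the numbers are hypothetical, and the model is deliberately crude: it assumes every replicated copy in a classic cluster is full size, and that SmartStore keeps at most one cached copy of whatever fraction of the rolled data is currently in the cache. It counts only indexer-local disk; the remote store still holds its own copy of every rolled bucket.

```python
def classic_warm_disk_gb(warm_data_gb: float, replication_factor: int) -> float:
    """Cluster-wide local disk for warm data when every RF copy stays on indexers."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return warm_data_gb * replication_factor


def smartstore_warm_disk_gb(warm_data_gb: float, cached_fraction: float) -> float:
    """Cluster-wide local disk for rolled buckets with one cached copy at most."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return warm_data_gb * cached_fraction


# Hypothetical cluster: 1000 GB of rolled data, RF = 3, a quarter of it cached.
classic = classic_warm_disk_gb(1000, replication_factor=3)
smart = smartstore_warm_disk_gb(1000, cached_fraction=0.25)
print(f"classic: {classic:.0f} GB local, SmartStore: {smart:.0f} GB local")
```

How much is actually cached depends on the eviction policy and the search pattern, so treat `cached_fraction` as something to measure rather than assume.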

For moving data to frozen, see the link below.

The cache manager downloads the data from S3, copies it to the frozen directory, and then removes the bucket from both the local cache and S3.

https://answers.splunk.com/answers/777620/splunk-smartstore-do-warm-buckets-need-to-roll-to-1.html