<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>SmartStore Behaviors in Knowledge Management</title>
    <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402776#M3500</link>
    <description>&lt;P&gt;The entirety of cold storage has 0 files in it. Remains true after running historical searches that would surely end up with some cold data in play.&lt;/P&gt;</description>
    <pubDate>Tue, 11 Jun 2019 20:28:03 GMT</pubDate>
    <dc:creator>twinspop</dc:creator>
    <dc:date>2019-06-11T20:28:03Z</dc:date>
    <item>
      <title>SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402765#M3489</link>
      <description>&lt;P&gt;I'd like to better understand what behaviors SmartStore is going to exhibit in my environment, and how do I manage them? What can I do to prepare my environment for SmartStore?&lt;/P&gt;</description>
      <pubDate>Mon, 08 Apr 2019 22:46:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402765#M3489</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2019-04-08T22:46:20Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402766#M3490</link>
      <description>&lt;P&gt;S2 behaviors, in no particular order. I will update this post as new information is learned.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;RF/SF only apply to hot buckets. Once a bucket is rolled, it is uploaded to S3 and any bucket replicates are marked for eviction.&lt;/LI&gt;&lt;LI&gt;The S2 cachemanager will download components of a bucket as searches determine what’s needed (bloomfilters, deletes, journal.* or other components), so it may look like multiple downloads of the same bucket are happening, but per component, no duplicate downloads should occur.&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Evictions don’t always seem to show up in MC on the S2 pages. The following search will surface them.&lt;/P&gt;&lt;P&gt;index=_internal sourcetype=splunkd source=*splunkd.log action=evictDeletes&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Starting in 7.2.4, additional metrics were added to count downloaded bytes. Prior to this version, Splunk was metrics-blind to the (potentially significant) impact a rolling restart induces on the network and storage.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;During a rolling restart, as each indexer is marked to go down, the CM begins to reassign primacy for that indexer’s buckets to other indexers.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;All buckets on the indexer being restarted are marked for eviction, effectively flushing that indexer’s cache.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;As indexers in the cluster are restarted, the others will start downloading buckets from S3 to satisfy search requests, which can take a heavy toll on local network and storage: all the indexers not being restarted will likely start requesting bucket downloads at once, so be prepared for that level of data transfer in a short period of time.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;SmartStore only allows one indexer at a time to be primary searchable for a bucket, and no other indexers are allowed to have copies of that bucket cached. The CM will issue eviction notices to any indexers with local copies of that bucket. This ensures that only one indexer will search that bucket and return results. As a result, a huge amount of data shuffling and downloading happens during a full cluster rolling restart.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Bucket rebalance works more quickly with S2 than without it, because the only buckets to rebalance are hot buckets.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;EM&gt;Added Nov 2019&lt;/EM&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Disk part 1: S2 disk I/O requirements seem to be higher than non-S2, because the bucket-downloading process needs to write large amounts of data quickly as the cachemanager populates buckets for search. The default download config allows 8 simultaneous downloads. Disks previously able to shoulder the load may not be up to the task of S2’s caching requirements. &lt;STRONG&gt;I'm looking at you, RAID5 volumes.&lt;/STRONG&gt; By definition it's cache space (and hot bucket space, but hot is replicated), so use RAID0 (stripe) to get the fastest disk possible without wasting a MB of available space. RAID10 (mirrored stripes) is also acceptable, but cuts usable disk space by 50%.&lt;/LI&gt;&lt;LI&gt;Disk part 2: To expand on the above a bit, S2 performance is about more than high IOPS; it's about throughput too. Customers running S2 in AWS who have chosen gp2 EBS volumes for hot/cachemanager will likely see severe IO contention, with IO wait % jumping during periods of heavy S2 bucket downloads from remote storage. This is quite easy to see in top or iostat when users run searches that trigger large bucket evictions &amp;amp; bucket downloads from remote storage. gp2 has a limit of 250MB/sec, which doesn't take long to hit when the network is 10 gig or faster: a fast network means data is written to the kernel buffer cache at a high rate, and when it's time to sync to disk, the storage can't keep up. The io1 EBS type is better, at 1000MB/s, but can still exhaust its throughput capacity during periods of concurrent heavy bucket downloads and searches that tax the storage for both reads and writes, on top of ingestion and hot bucket replication. In AWS, it is highly recommended to use NVMe for hot/cachemanager (i3 and i3en instance types work very well here) in RAID0, and to consider setting RF/SF=3 (it still applies to hot buckets) to sleep better at night.&lt;/LI&gt;&lt;LI&gt;Disk part 3: If deploying S2 outside of AWS, strive to obtain the fastest disks (throughput &amp;amp; IOPS) available, whether local SSD or NVMe, to avoid storage bottlenecks getting in the way of your Splunk performance.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Thu, 13 May 2021 03:59:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402766#M3490</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2021-05-13T03:59:10Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402767#M3491</link>
      <description>&lt;P&gt;This is a really good rundown for anyone planning to use S2. Thanks for the summary @davidpaper!&lt;/P&gt;</description>
      <pubDate>Mon, 08 Apr 2019 23:58:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402767#M3491</guid>
      <dc:creator>twinspop</dc:creator>
      <dc:date>2019-04-08T23:58:35Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402768#M3492</link>
      <description>&lt;P&gt;I think that it is worth noting explicitly (@davidpaper implied it) that currently, &lt;CODE&gt;SS/S3&lt;/CODE&gt; is &lt;STRONG&gt;NOT&lt;/STRONG&gt; practical for &lt;CODE&gt;hot/cold&lt;/CODE&gt; buckets/volumes and that it should &lt;STRONG&gt;ONLY&lt;/STRONG&gt; be used for &lt;CODE&gt;warm&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Apr 2019 16:51:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402768#M3492</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-04-13T16:51:01Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402769#M3493</link>
      <description>&lt;P&gt;I'm not sure I follow. You don't have a choice of WARM or COLD with S2. There is HOT; briefly there is WARM while waiting to upload to remote; and finally there is remote with cached local copies. The entire bucket lifecycle changes.&lt;/P&gt;
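&lt;P&gt;(One way to watch this lifecycle in action is to tail splunkd.log for cachemanager activity; a search along these lines should work, though message formats vary by Splunk version:)&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;index=_internal sourcetype=splunkd source=*splunkd.log component=CacheManager&lt;/CODE&gt;&lt;/P&gt;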

&lt;P&gt;At least this is my understanding.&lt;/P&gt;</description>
      <pubDate>Sun, 14 Apr 2019 01:06:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402769#M3493</guid>
      <dc:creator>twinspop</dc:creator>
      <dc:date>2019-04-14T01:06:39Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402770#M3494</link>
      <description>&lt;P&gt;Ah, this isn't really the case, but I can see how it might appear this way. There is now only "hot" and "not hot" in terms of a bucket lifecycle in S2. The concept of warm and cold being separate is no longer really a thing.&lt;/P&gt;

&lt;P&gt;Hot (read/write) is still replicated based on CM RF/SF settings until it rolls to read-only; then one copy of the bucket is uploaded to S3, and the other local copies are marked for deletion by the indexers' cachemanager process.&lt;/P&gt;
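&lt;P&gt;(Since everything that has rolled lives in the local cache, each indexer's cache size and eviction headroom can be tuned in server.conf. The values below are illustrative, not recommendations; check the spec file for your version:)&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;[cachemanager]&lt;BR /&gt;max_cache_size = 0        # MB; 0 means bounded only by available disk&lt;BR /&gt;eviction_padding = 5120   # MB of free-space headroom to preserve&lt;/CODE&gt;&lt;/P&gt;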

&lt;P&gt;The cachemanager retrieves read-only buckets from S3 when needed to complete a search, and those buckets share the same file system as hot...so make sure your hot/cachemanager filesystem is nice and fast.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Jun 2019 19:46:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402770#M3494</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2019-06-05T19:46:12Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402771#M3495</link>
      <description>&lt;P&gt;After learning more, apparently a more accurate statement is &lt;CODE&gt;When using SmartStore, there is no need to use cold at all, and Splunk should always be configured to have NO COLD&lt;/CODE&gt;, or maybe not...?&lt;/P&gt;</description>
      <pubDate>Sun, 09 Jun 2019 21:51:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402771#M3495</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-06-09T21:51:58Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402772#M3496</link>
      <description>&lt;P&gt;Once an index is converted to use SmartStore, you are spot on. &lt;EM&gt;No more need for a coldPath entry for that index.&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Edit: The above is incorrect. You still need a coldPath entry in indexes.conf for the index, but the cold volume shouldn't be actively used once the buckets have been evicted from there. &lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 13:59:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402772#M3496</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2019-06-11T13:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402773#M3497</link>
      <description>&lt;P&gt;The index still requires a configured coldPath.  See &lt;A href="https://docs.splunk.com/Documentation/Splunk/7.3.0/Indexer/MigratetoSmartStore"&gt;https://docs.splunk.com/Documentation/Splunk/7.3.0/Indexer/MigratetoSmartStore&lt;/A&gt;&lt;/P&gt;
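&lt;P&gt;(For reference, a minimal SmartStore-enabled stanza keeps the coldPath entry alongside remotePath. Volume names and paths below are illustrative, not from this thread:)&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;[volume:remote_store]&lt;BR /&gt;storageType = remote&lt;BR /&gt;path = s3://my-s2-bucket/indexes&lt;BR /&gt;&lt;BR /&gt;[my_index]&lt;BR /&gt;homePath = $SPLUNK_DB/my_index/db&lt;BR /&gt;coldPath = $SPLUNK_DB/my_index/colddb   # still required&lt;BR /&gt;thawedPath = $SPLUNK_DB/my_index/thaweddb&lt;BR /&gt;remotePath = volume:remote_store/$_index_name&lt;/CODE&gt;&lt;/P&gt;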

&lt;P&gt;Also, any buckets that were in the coldPath at the time the index was migrated to SmartStore remain in the coldPath. See &lt;A href="https://docs.splunk.com/Documentation/Splunk/7.3.0/Indexer/SmartStoreindexing"&gt;https://docs.splunk.com/Documentation/Splunk/7.3.0/Indexer/SmartStoreindexing&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 19:16:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402773#M3497</guid>
      <dc:creator>Steve_G_</dc:creator>
      <dc:date>2019-06-11T19:16:55Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402774#M3498</link>
      <description>&lt;P&gt;I can confirm that once migration takes place, buckets are &lt;EM&gt;no longer stored on cold&lt;/EM&gt;.&lt;BR /&gt;
EDIT: I tried to force the CM to populate the cache with "cold buckets" but failed to replicate this behavior. (I ran a search over a small window from months ago on an index known to have had cold data at the time of migration. No colddb population.)&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 20:09:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402774#M3498</guid>
      <dc:creator>twinspop</dc:creator>
      <dc:date>2019-06-11T20:09:57Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402775#M3499</link>
      <description>&lt;P&gt;Strictly speaking, it's true that the bucket contents will no longer be under coldPath, post-migration, as they are now stored remotely.  But the bucket metadata should still be under coldPath, and bucket contents will get moved to coldPath if required to fulfill a search.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 20:25:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402775#M3499</guid>
      <dc:creator>Steve_G_</dc:creator>
      <dc:date>2019-06-11T20:25:17Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402776#M3500</link>
      <description>&lt;P&gt;The entirety of cold storage has 0 files in it. Remains true after running historical searches that would surely end up with some cold data in play.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 20:28:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402776#M3500</guid>
      <dc:creator>twinspop</dc:creator>
      <dc:date>2019-06-11T20:28:03Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402777#M3501</link>
      <description>&lt;P&gt;Weird.  Per developer, it's not supposed to work that way.  I'll follow up and report back.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Jun 2019 20:35:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402777#M3501</guid>
      <dc:creator>Steve_G_</dc:creator>
      <dc:date>2019-06-11T20:35:12Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402778#M3502</link>
      <description>&lt;P&gt;@SloshBurch - We need a best practice wizard in here.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2019 19:33:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402778#M3502</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-06-12T19:33:08Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402779#M3503</link>
      <description>&lt;P&gt;Thanks @woodcock. I hope to tackle SmartStore soon and will revisit this at that time.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Jul 2019 14:26:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402779#M3503</guid>
      <dc:creator>sloshburch</dc:creator>
      <dc:date>2019-07-16T14:26:14Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402780#M3504</link>
      <description>&lt;P&gt;After migration to SmartStore, the data on coldPath is not automatically removed unless it is forced out through eviction or through the natural aging process. As Steve pointed out, the coldPath will have metadata stubs, and any searches that span the cold data will download the data from S3 back to the coldPath.&lt;/P&gt;

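&lt;P&gt;(If coldPath is relocated after migration, a sketch like the following, with illustrative paths, puts it on the same fast volume that backs homePath, so any buckets downloaded back to the cold tier land on cache-grade disk:)&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;[my_index]&lt;BR /&gt;homePath = volume:fast_disk/my_index/db&lt;BR /&gt;coldPath = volume:fast_disk/my_index/colddb&lt;BR /&gt;remotePath = volume:remote_store/$_index_name&lt;/CODE&gt;&lt;/P&gt;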
&lt;P&gt;Alternatively, after migration, the coldPath location can be changed to some other location (or even homePath) as the idea for migration is to get only a single copy on to S3 and reclaim the space from the warm and cold tiers. &lt;/P&gt;</description>
      <pubDate>Wed, 16 Oct 2019 16:14:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402780#M3504</guid>
      <dc:creator>srajarat2</dc:creator>
      <dc:date>2019-10-16T16:14:30Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402781#M3505</link>
      <description>&lt;P&gt;This is spot on, and a behavior I hadn't understood until very recently. Reassigning coldPath to homePath is an excellent idea.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Oct 2019 16:21:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402781#M3505</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2019-10-16T16:21:39Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402782#M3506</link>
      <description>&lt;P&gt;In my logs I see "deletes" files being downloaded. What is the deletes file in the bucket used for? Thanks &lt;/P&gt;</description>
      <pubDate>Mon, 21 Oct 2019 14:03:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402782#M3506</guid>
      <dc:creator>jamie00171</dc:creator>
      <dc:date>2019-10-21T14:03:42Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402783#M3507</link>
      <description>&lt;P&gt;That file stores the info used to block events that have had "|delete" run against them in the past from showing up in search results. &lt;/P&gt;</description>
      <pubDate>Mon, 21 Oct 2019 17:57:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402783#M3507</guid>
      <dc:creator>davidpaper</dc:creator>
      <dc:date>2019-10-21T17:57:55Z</dc:date>
    </item>
    <item>
      <title>Re: SmartStore Behaviors</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402784#M3508</link>
      <description>&lt;P&gt;Hi David, this is a great session. &lt;BR /&gt;
Today, one Splunk instance hit some issues with SmartStore on top of on-prem object storage. It had worked normally since SmartStore was enabled several months ago. Most of the time, the indexing rate per indexer is about 8-10MB/s. But during a spike (not sure how large yet), the indexing processor got stuck, consuming 100% CPU on the indexer. All pipelines were blocked and couldn't recover. The indexing rate dropped to 2MB/s. They restarted the indexer, and it went back to normal with an indexing rate of 16MB/s. &lt;BR /&gt;
Around 20 minutes before the congestion, errors like "DatabaseDirectoryManager - failed to open bucket/waif for bucket to be local through CacheManager" started to be reported by the indexer. &lt;BR /&gt;
Their hot buckets are on SSD without RAID. &lt;/P&gt;
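&lt;P&gt;(To quantify how often that error fired leading up to the stall, a search along these lines should work; the component name is taken from the error message above:)&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;index=_internal sourcetype=splunkd source=*splunkd.log component=DatabaseDirectoryManager log_level=ERROR | timechart count&lt;/CODE&gt;&lt;/P&gt;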

&lt;P&gt;Any thought on this case? &lt;/P&gt;</description>
      <pubDate>Tue, 12 Nov 2019 06:15:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/SmartStore-Behaviors/m-p/402784#M3508</guid>
      <dc:creator>ypeng_splunk</dc:creator>
      <dc:date>2019-11-12T06:15:55Z</dc:date>
    </item>
  </channel>
</rss>