Hi,
We are currently looking into using the SmartStore feature. However, I am having difficulty finding documentation on how to calculate the amount of storage we would need, both local for the cache and on our S3 solution. The only detail I can find is:
The amount of local storage available on each indexer for cached data must be in proportion to the expected working set. For best results, provision enough local storage to accommodate the equivalent of 30 days' worth of indexed data. For example, if the indexer is adding approximately 100GB/day of indexed data, the recommended size reserved for cached data is 3000GB
Using this example, does it mean that each of our indexers would need 3TB of local storage for the cache, or is it the same as the traditional storage method, where this would be divided by the number of indexers?
Remote Object Store sizing = Daily Ingest Rate x Compression Ratio x Retention period
The compression ratio is generally 50% (15% from the compressed rawdata and 35% from the tsidx metadata files), but this is entirely dependent on the type of data. Higher-cardinality data compresses less effectively, which increases the storage sizing requirement.
Global Cache sizing = Daily Ingest Rate x Compression Ratio x (RF x Hot Days + (Cached Days - Hot Days))
Cache sizing per indexer = Global Cache sizing / Number of indexers
Cached Days = Splunk recommends 30 days for Splunk Enterprise and 90 days for Enterprise Security
Hot Days = Number of days before hot buckets roll over to warm buckets. Ideally this will be between 1 and 7, but configure it based on how hot buckets roll in your environment.
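As a rough illustration, the formulas above can be sketched in Python. The function names and the example inputs (365 days retention, RF of 2, 1 hot day, 4 indexers) are assumptions made up for this example; only the 100GB/day figure, the 50% ratio and the 30 cached days come from this thread.

```python
def remote_store_gb(daily_ingest_gb, compression_ratio, retention_days):
    # Remote Object Store sizing = Daily Ingest Rate x Compression Ratio x Retention period
    return daily_ingest_gb * compression_ratio * retention_days


def global_cache_gb(daily_ingest_gb, compression_ratio, rf, hot_days, cached_days):
    # Hot buckets exist RF times; warm buckets within the cached window are counted once.
    return daily_ingest_gb * compression_ratio * (rf * hot_days + (cached_days - hot_days))


def cache_per_indexer_gb(global_cache, num_indexers):
    # Cache sizing per indexer = Global Cache sizing / Number of indexers
    return global_cache / num_indexers


# Hypothetical environment: 100GB/day, 50% compression, 1 year retention,
# RF=2, hot buckets roll after 1 day, 30 cached days, 4 indexers.
remote = remote_store_gb(100, 0.5, 365)
cache = global_cache_gb(100, 0.5, rf=2, hot_days=1, cached_days=30)
print(remote)                           # GB needed on the remote store
print(cache)                            # GB of cache across the cluster
print(cache_per_indexer_gb(cache, 4))   # GB of cache per indexer
```

Swap in your own ingest rate, compression ratio and RF; the compression ratio in particular should be measured on your data rather than assumed.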
This is helpful. Can you provide links to the documentation containing these formulas?
The cache sizing should be updated, given that Splunk 8.0+ keeps RF copies of a bucket across the cache until they are evicted. When SmartStore was introduced in 7.2, the behavior was to keep only one copy of the bucket in the cache after the warm bucket was uploaded to S3, along with RF-1 stubs (just metadata). The behavior has changed since 8.0 (not sure if it was from 7.3+) for performance reasons.
Global Cache sizing = Daily Ingest Rate x Compression Ratio x RF x Cached Days
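A minimal Python sketch of this updated formula, assuming the same kind of inputs as before. The function name and the example values (RF of 2, 4 indexers) are hypothetical and only there to show the arithmetic.

```python
def global_cache_gb_rf_copies(daily_ingest_gb, compression_ratio, rf, cached_days):
    # Global Cache sizing = Daily Ingest Rate x Compression Ratio x RF x Cached Days
    # (assumes all RF copies stay in the cache until normal eviction)
    return daily_ingest_gb * compression_ratio * rf * cached_days


# Hypothetical environment: 100GB/day, 50% compression, RF=2, 30 cached days, 4 indexers.
total = global_cache_gb_rf_copies(100, 0.5, rf=2, cached_days=30)
print(total)       # GB of cache across the cluster
print(total / 4)   # GB of cache per indexer
```

With RF copies retained, the cache requirement scales linearly with RF, so it is noticeably larger than under the single-copy assumption.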
According to this documentation:
https://docs.splunk.com/Documentation/Splunk/8.1.3/Indexer/SmartStorearchitecture
"When the hot bucket rolls to warm, the data flow diverges, however. The source indexer copies the warm bucket to the remote object store, while leaving the existing copy in its cache, since searches tend to run across recently indexed data. The target indexers, however, delete their copies, because the remote store ensures high availability without the need to maintain multiple local copies. The master copy of the bucket now resides on the remote store."
Once the warm data buckets roll to remote storage, target indexers delete their copies.
This is based on the following setting in server.conf on the indexers:
evict_on_stable
By default, this parameter is set to false, which means the RF copies are preserved in the cache until they are evicted by the cache eviction policies.
@srajarat2 thank you for your note. This helps!
So, we could set the flag to false to override this behavior, correct?
Also, it is not clear how much of the metadata is retained on the target peers after eviction (assuming this flag is set to true). Could you share your thoughts?
Copy-paste from server.conf.spec (8.2.0) for future reference.
evict_on_stable = <boolean>
* When the source peer completes upload of a bucket to remote storage, it notifies the target peers so that they can evict any local copies of the bucket.
* When set to true, each target peer evicts its local copy, if any, upon such notification.
* When set to false, each target peer continues to store its local copy, if any, until its cache manager eventually evicts the bucket according to its cache eviction policy.
* Default: false
I am assuming you meant setting this parameter to TRUE so only one single copy can be maintained in the cache.
I don't know the exact details of the metadata maintained, but I do know that the cluster master maintains details about bucket locality (remote vs. local, whether a copy has been uploaded to remote storage, etc.). Given the SmartStore architecture, the indexers are generally stateless, so they shouldn't be holding metadata themselves. Note: this is my assumption and might be wrong. If anyone else has good knowledge about metadata with Splunk SmartStore, please feel free to update.
Take your total daily ingestion rate, divide by the number of indexers, and multiply by 30. That’s what you should have per indexer. This of course assumes your working set is 30 days. If you can quantify the time span your average searches cover, you can adjust accordingly. Most data probably is “stale” after 7 days, though this of course depends on your use case.
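The rule of thumb above works out as follows in Python. This is just a sketch: the function name, the 4-indexer cluster and the 7-day alternative are assumptions for illustration, and `daily_gb` is whatever daily volume you decide to size against.

```python
def cache_per_indexer_gb(daily_gb, num_indexers, working_set_days=30):
    # Per-indexer cache = (daily volume / number of indexers) x working-set days
    return daily_gb / num_indexers * working_set_days


# Hypothetical cluster: 100GB/day spread across 4 indexers.
print(cache_per_indexer_gb(100, 4))                      # 30-day working set
print(cache_per_indexer_gb(100, 4, working_set_days=7))  # 7-day working set
```

Shrinking the working set from 30 to 7 days cuts the per-indexer cache proportionally, which is why it is worth measuring the time span your searches actually cover.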
It should be the daily indexed rate, as per the Splunk docs, and not the ingested rate.
The cache is for each indexer.