Knowledge Management

Smartstore:SmartStore cache is not respecting cache limits

Splunk Employee
Splunk Employee

Smartstore doesn't appear to be respecting our disk usage limits via the DiskUsage & CacheMan stanzas (minFreeSpace & max_cache_size respectively). Is there way to purge smartstore data that is using this excessive space? i've attached a diag from one of our peers & our cluster master.

Labels (1)
Tags (3)
0 Karma

Splunk Employee
Splunk Employee

Splunk has attribute max_cache_size that set the limit for cacahemanager

eviction_padding = 10% of disk Space on Partition (in bytes)

Given that max_cache_size is not accounting for hot data, it is likely that space usage is going to be above max_cache_size.o limit the disk usage for both hot data + cached data, you can use minFreeSpace to restrict total disk usage. However, you need to have reasonable total space so hot buckets are not overcrowding the cache.
Splunk component CacheManager when put in debug mode provides the stats for key attribute that contribute to the size of Cacahemanager.

02-05-2020 21:09:41.898 +0000 DEBUG CacheManager - The system has freebytes=75354976256 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=313491456 maxSize=314572800

"The system has freebytes=" > freeBytes
" with minfreebytes=" > minFreeBytes
" cachereserve=" > evictionReservedBytes
" totalpadding=" > minFreeBytes + evictionReservedBytes
" buckets_size=" > buckets_size
" maxSize=" > maxSize

Now here is some information on each of this attribute.

1) " maxSize=" << _maxSize;

Now this comes from

*max_cache_size = *
* Specifies the maximum space, in megabytes, per partition, that the cache can
occupy on disk. If this value is exceeded, the cache manager starts
evicting buckets.
* A value of 0 means this feature is not used, and has no maximum size.
* Default: 0

2) " buckets_size=" << buckets_size
Now this is the total size of the buckets calculated.

3) " totalpadding=" << minFreeBytes + evictionReservedBytes

we can see here that totalpadding=10611589120 = (minfreebytes+cachereserve) = (5242880000+5368709120)

From the below, we can see that buckets_size can get bigger than maxSize for a temporary period but eventually we will slash it down to maxSize(max_cache_size) OR lesser.

Here in this example, we have

As you can see how buckets_size is varying with maxSize wrt time. Once we have buckets_size growing above maxSize, the CacheManager will ensure that we evict something and get back within maxSize.

max_cache_size = 300

alt text

Get Updates on the Splunk Community!

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...