Re: SmartStore cache manager not enforcing limit ...

rbal_splunk · ‎06-29-2020

The indexer cluster across the site is configured with SmartStore. Each indexer has a 6TB partition that holds both $SPLUNK_HOME and $SPLUNK_DB.

The Cache Manager is configured as below

$SPLUNK_HOME/etc/system/default/server.conf [diskUsage]
$SPLUNK_HOME/etc/system/default/server.conf minFreeSpace = 5000
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf [cachemanager]
$SPLUNK_HOME/etc/system/default/server.conf evict_on_stable = false
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf eviction_padding = 5120
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf eviction_policy = lru
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf hotlist_bloom_filter_recency_hours = 720
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf hotlist_recency_secs = 604800
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf max_cache_size = 4096000
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf max_concurrent_downloads = 8
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf max_concurrent_uploads = 8
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf remote.s3.multipart_max_connections = 4
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf remote.s3.multipart_upload.part_size = 536870912

The indexer shows the 6TB partition at 97% utilization, although usage should not have exceeded 4TB based on max_cache_size = 4096000.

Filesystem      1K-blocks       Used  Available Use% Mounted on
devtmpfs         71967028          0   71967028   0% /dev
tmpfs            71990600          0   71990600   0% /dev/shm
tmpfs            71990600    4219944   67770656   6% /run
tmpfs            71990600          0   71990600   0% /sys/fs/cgroup
/dev/nvme0n1p2   20959212    6812056   14147156  33% /
none             71990600          0   71990600   0% /run/shm
/dev/nvme1n1   6391527336 5864488560  204899848  97% /opt/splunk
tmpfs            14398120          0   14398120   0% /run/user/1003

Here are the DEBUG entries for CacheManager:

06-10-2020 19:32:42.604 +0000 DEBUG CacheManager - The system has freebytes=210838605824 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:32:42.607 +0000 DEBUG CacheManager - The system has freebytes=210838536192 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:32:46.502 +0000 DEBUG CacheManager - The system has freebytes=210850021376 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:32:46.505 +0000 DEBUG CacheManager - The system has freebytes=210850172928 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:33:06.727 +0000 DEBUG CacheManager - The system has freebytes=210255511552 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000

Notes from the DEBUG output:

freebytes= 210072649728
minfreebytes= 5242880000
cachereserve= 5368709120
totalpadding= 10611589120
buckets_size= 3069785296896 <<<<<< 3TB as calculated by CacheManager
maxSize= 4294967296000 <<<<<< configured 4TB limit

The issue is that the cache has used almost 6TB of disk space, but the cache manager's own calculation shows only 3TB in use. Because of this miscalculation, Splunk is not evicting buckets.
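To make the gap concrete, here is a rough Python sketch that parses one of the DEBUG lines above and compares it with the df figure for /opt/splunk. The helper name parse_cachemanager_line is made up for illustration, and the comparison assumes that nearly all of the used space on that partition is cache data, which is not strictly true since $SPLUNK_HOME lives there too.

```python
import re

# One of the CacheManager DEBUG lines quoted above.
LOG_LINE = (
    "06-10-2020 19:32:42.604 +0000 DEBUG CacheManager - The system has "
    "freebytes=210838605824 with minfreebytes=5242880000 "
    "cachereserve=5368709120 totalpadding=10611589120 "
    "buckets_size=3069799919616 maxSize=4294967296000"
)


def parse_cachemanager_line(line):
    """Return the key=value integer fields of a CacheManager DEBUG line.

    Hypothetical helper, not part of any Splunk tooling.
    """
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


fields = parse_cachemanager_line(LOG_LINE)

# "Used" column for /opt/splunk from the df output, in 1K blocks.
DF_USED_KIB = 5864488560
df_used_bytes = DF_USED_KIB * 1024

# Space in use on disk that the cache manager does not count as buckets.
unaccounted = df_used_bytes - fields["buckets_size"]

# The disk is over the configured limit, but the tracked size is not,
# so eviction based on max_cache_size never triggers.
over_limit = df_used_bytes > fields["maxSize"]
tracked_over_limit = fields["buckets_size"] > fields["maxSize"]

TB = 10**12
print(f"df used:      {df_used_bytes / TB:.2f} TB")
print(f"buckets_size: {fields['buckets_size'] / TB:.2f} TB")
print(f"maxSize:      {fields['maxSize'] / TB:.2f} TB")
print(f"unaccounted:  {unaccounted / TB:.2f} TB")
print(f"disk over limit: {over_limit}, tracked over limit: {tracked_over_limit}")
```

The same parse also confirms that totalpadding is just minfreebytes plus cachereserve, and that the log values are the server.conf settings multiplied by 1048576 (so the "MB" settings are treated as MiB).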

rbal_splunk · ‎06-29-2020

The computation for max_cache_size is broken in versions 7.2.6 through 8.0.4.

As a workaround, the best option is to disable the cache size limit with max_cache_size=0 and control cache usage with eviction_padding=<CONFIGURE_AS_PER_DESIRED_LIMIT> instead.

For details on the JIRA, contact Splunk Support.

kranthimutyala · ‎08-11-2020

Hi @rbal_splunk, we are planning to implement SmartStore in our existing environment (a distributed environment with non-clustered indexers). I have a few questions about this.

We have 15 indexers, each with 9TB of total disk space, and our daily ingestion volume is ~5TB.

Please let me know how much cache size we need to reserve for 30 days of retention in SmartStore.

We are afraid to set max_cache_size=0, since the cache would then occupy all of the free space and cause problems.

Additionally, if we decommission the servers and build new ones, how do we link the old SmartStore data to the new indexers?

Please advise. Thanks

rbal_splunk · ‎08-12-2020

Currently we have an issue in the calculation of the cache size, so the recommendation is to set max_cache_size=0 (so that it is ignored).

To manage the cache size, use eviction_padding:

https://docs.splunk.com/Documentation/Splunk/8.0.5/Admin/Serverconf

eviction_padding = <positive integer>
* Specifies the additional space, in megabytes, beyond 'minFreeSpace' that the
  cache manager uses as the threshold to start evicting data.
* If free space on a partition falls below
  ('minFreeSpace' + 'eviction_padding'), then the cache manager tries to evict
  data from remote storage enabled indexes.
* Default: 5120 (~5GB)

Set this value to the amount of disk space (in megabytes) that you would like to keep empty.
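As a rough illustration of how to pick the value, the sketch below works backwards from a desired usage ceiling on the partition. It assumes eviction starts once free space drops below minFreeSpace + eviction_padding (as the spec text above says) and that free space is simply partition size minus used space; the function name and the 4096000 MB target are only examples taken from the original post, not a recommendation.

```python
def eviction_padding_mb(partition_kib, target_used_mb, min_free_space_mb):
    """Estimate eviction_padding (MB) so eviction starts near target_used_mb.

    Hypothetical helper. Assumes free = partition size - used, which ignores
    filesystem reserved blocks, so treat the result as a starting point.
    """
    partition_mb = partition_kib // 1024  # df reports 1K blocks
    padding = partition_mb - target_used_mb - min_free_space_mb
    if padding < 0:
        raise ValueError("target usage does not fit on this partition")
    return padding


# /opt/splunk from the df output above, minFreeSpace = 5000 from server.conf.
padding = eviction_padding_mb(
    partition_kib=6391527336,
    target_used_mb=4096000,
    min_free_space_mb=5000,
)
print(f"eviction_padding = {padding}")
```

With these inputs the sketch gives eviction_padding = 2140725, meaning eviction would begin once free space on the 6TB partition falls below roughly 2.1TB.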

rjfneto_pags · ‎11-24-2020

How can I see these DEBUG entries for CacheManager?

Thanks in advance

garyjohnson48 · ‎11-24-2020

Go to Settings > Server settings > Server logging.

Type cachemanager in the search bar.

Click on CacheManager and change the logging level to DEBUG.

Then search index=_internal component=cachemanager

This only takes effect on the server where you changed the setting. You can also change the logging level in the configuration under $SPLUNK_HOME/etc.
