[SmartStore] Understand Retention with SmartStore?

rbal_splunk · ‎01-21-2019

We are planning to migrate to Smartstore and looking to understand the retention changes that come with it?

rbal_splunk · ‎03-05-2020

Starting 7.3 Splunk ensures that bucket copies are not evicted on target indexers after the hot bucket rolls to warm and is uploaded by source.

However for buckets that are already warm and that are being converted to s2 and are being uploaded by either the source or the targets, they would not be evicted on upload irrespective of the version. buckets would only be evicted eventually when there is cache pressure

rbal_splunk · ‎01-21-2019

Just like classic indexer clustering, for both clustered and non-clustered s2-enabled environments we need retention policies to freeze buckets. But unlike non-s2 environments, where we used to freeze and delete the buckets once they are eligible for freezing, we have to delete the buckets both locally and from remote storage so that all the bucket states are in sync and we don't end up introducing the bucket accidentally in the cluster again. This is taken care of by CMMasterRemoteStorageThread which runs every remote_storage_retention_period (defaults to 15 minutes) to check if there are buckets in remote storage that needs to be frozen in the cluster.

Below is how CMMasterRemoteStorageThread runs retention procedure -

i). It runs a search on all peers to retrieve the list of remote indexes with frozenTimePeriodInSecs, maxGlobalDataSizeMB and maxGlobalRawDataSizeMB information
ii). It then runs a search on all the peers to retrieve the list of the warm buckets that need to be frozen based on the frozenTimePeriodInSecs, maxGlobalDataSizeMB and maxGlobalRawDataSizeMB thresholds
iii). Does the following for each bucket:

Retrieve peers that have this bucket from CM
Assign this bucket to a peer that was randomly picked from the retrieved peer list

iv). For each peer that has bucket(s) assigned: Hit the clusterslavecontrol endpoint with the list of the buckets i.e. "/services/cluster/slave/control/control/freeze-buckets/" endpoint.
v). peer freezes the bucket and sends a notification to CM via "/services/cluster/master/control/control/freeze-buckets/".
vi). CM sends the notification back to the other peers which then freeze the bucket on their end by running a batch_bucket_frozen job which posts https request to peers via "/services/cluster/slave/control/control/freeze_buckets_by_list" endpoint.

Retention in S2 works on two policies -

a). Size based retention : This is controlled by maxGlobalDataSizeMB in indexes.conf. If an index crosses this threshold of index size across all indexers in a cluster, then the oldest data is frozen. This always take precedence over time-based retention i.e. frozenTimePeriodInSecs.

maxGlobalDataSizeMB = <nonnegative integer>
* The maximum amount of local disk space (in MB) that a remote storage
  enabled index can occupy, shared across all peers in the cluster.
* This attribute controls the disk space that the index occupies on the peers
  only. It does not control the space that the index occupies on remote storage.
* If the size that an index occupies across all peers exceeds the maximum size,
  the oldest data is frozen.
* For example, assume that the attribute is set to 500 for a four-peer cluster,
  and each peer holds a 100 MB bucket for the index. If a new bucket of size
  200 MB is then added to one of the peers, the cluster freezes the oldest bucket
  in the cluster, no matter which peer the bucket resides on.
* This value applies to hot, warm and cold buckets. It does not apply to
  thawed buckets.
* The maximum allowable value is 4294967295
* Defaults to 0, which means that it does not limit the space that the index
  can occupy on the peers.

b). Time-based retention : This is controlled by frozenTimePeriodInSecs in indexes.conf. Once the bucket is older then frozenTimePeriodInSecs, CMMasterRemoteStorageThread will then freeze the bucket across the cluster, both locally and on remote storage.

frozenTimePeriodInSecs = <nonnegative integer>
* Number of seconds after which indexed data rolls to frozen.
* If you do not specify a coldToFrozenScript, data is deleted when rolled to
  frozen.
* IMPORTANT: Every event in the DB must be older than frozenTimePeriodInSecs
  before it will roll. Then, the DB will be frozen the next time splunkd
  checks (based on rotatePeriodInSecs attribute).
* Highest legal value is 4294967295
* Defaults to 188697600 (6 years).

Here is flow with an example of "_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC"

i). CMMasterRemoteStorageThread retrieves the list of remote storage indexes just as mentioned in 3.i)

05-21-2019 02:24:00.292 +0000 INFO  CMMasterRemoteStorageThread - retrieving remote indexes info with search=| rest services/data/indexes datatype=all f=title f=frozenTimePeriodInSecs f=maxGlobalDataSizeMB f=remotePath f=disabled| search remotePath!="" AND disabled!=1| dedup title| fields title frozenTimePeriodInSecs maxGlobalDataSizeMB

ii) CMMasterRemoteStorageThread then retrieves the list of the warm buckets that need to be frozen just as mentioned in 3.ii)

05-21-2019 02:24:00.846 +0000 INFO  CMMasterRemoteStorageThread - Will initiate retrieving the list of buckets to be frozen for remote storage retention for index=_internal with frozenTimePeriodInSecs=2592000 and maxGlobalDataSizeMB=0
05-21-2019 02:24:00.846 +0000 INFO  CMMasterRemoteStorageThread - retrieving the list of buckets to be frozen for remote storage retention for index=_internal with search=| dbinspect index=_internal cached=true timeformat=%s| search state=warm OR state=cold| search modTime != 1| stats max(endEpoch) AS endEpoch BY bucketId| sort -endEpoch| search endEpoch<1555813440| fields bucketId, endEpoch

iii). It then retrieves the list of peers that have this bucket and then picks an indexer randomly to start freezing of bucket from remote storage as well as locally, in our case it is iTEST_INDEXER1:8089

05-21-2019 02:24:00.945 +0000 INFO  CMMasterRemoteStorageThread - Freezing bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC based on frozenTimePeriodInSecs
0505-21-2019 02:24:01.040 +0000 INFO  CMMasterRemoteStorageThread -  freezing buckets=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC as part of remote storage retention on hp=TEST_INDEXER:8089

iv). CM posts an http request to the indexer to freeze the bucket, as mentioned in 3.iv). Below entry is logged on indexer and the i.p XX.X.XX.XXX is of CM.

XX.X.XX.XXX - splunk-system-user [21/May/2019:02:24:01.046 +0000] "POST /services/cluster/slave/control/control/freeze-buckets HTTP/1.1" 200 1804 - - - 1ms

v). Peer now freezes the bucket and removes the bucket both from a remote storage and locally, as mentioned in 3.v)

05-21-2019 02:24:01.047 +0000 INFO  DatabaseDirectoryManager - cid="bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC|" found to be on remote storage
05-21-2019 02:24:01.047 +0000 INFO  BucketMover - Will freeze bkt=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC path='/opt/splunk/var/lib/splunk/_internaldb/db/db_1555290488_1554859220_697_9EA978AA-B109-473E-A526-F09AEE391FCC'
05-21-2019 02:24:01.047 +0000 INFO  BucketMover - RemoteStorageAsyncFreezer trying to freeze bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC, freezeInitiatedByAnotherPeer=false
05-21-2019 02:24:01.047 +0000 INFO  CacheManager - Evicted cachedId=bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC| freed_space=0 reason= manual earliest_time=1554859220 latest_time=1555290488
05-21-2019 02:24:01.047 +0000 INFO  CacheManager - will remove cacheId="bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC|" removeRemote=1
05-21-2019 02:24:01.303 +0000 INFO  CMSlave - deleteBucket bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC, frozen=true
05-21-2019 02:24:01.303 +0000 INFO  BucketMover - RemoteStorageAsyncFreezer freeze completed succesfully for bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC

NOTE: Couple of interesting things which can help in finding which peer initiated the freeze of the bucket on remote storage -

a) freezeInitiatedByAnotherPeer=false -> This means that freeze has been initiated by this peer and it is this peer's responsibility to freeze the bucket from remote storage too.

b) removeRemote=1 -> This means that we are removing the bucket from remote storage.

The corresponding entry in audit.log on the peer for removing the bucket from remote storage and evicting the bucket locally -

05-21-2019 02:24:01.301 +0000 INFO  AuditLogger - Audit:[timestamp=05-21-2019 02:24:01.301, user=n/a, action=remote_bucket_remove, info=completed, cache_id="bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC|", receipt_id=trek/_internal/db/b1/e 697~9EA978AA-B109-473E-A526-F09AEE391FCC/receipt.json, prefix=trek/_internal/db/b1/ee/697~9EA978AA-B109-473E-A526-F09AEE391FCC][n/a]
05-21-2019 02:24:02.230 +0000 INFO  AuditLogger - Audit:[timestamp=05-21-2019 02:24:02.230, user=n/a, action=local_bucket_evict, info=completed, cache_id="bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC|", kb=15, elapsed_ms=1, files="strings_data,sourcetypes_data,sources_data,hosts_data,lex,tsidx,bloomfilter,journal_gz,deletes,other"][n/a]

vi). Peer now sends a freeze notification back to CM. Below line is logged on CM and YY.Y.YY.YYY is the IP of the peer which sent the request to CM.

YY.Y.YY.YYY - splunk-system-user [21/May/2019:02:24:01.363 +0000] "POST /services/cluster/master/control/control//freeze-buckets/?output_mode=json HTTP/1.1" 200 305 - - - 0ms

Now, CM will remove the bid from its memory for this peer.

05-21-2019 02:24:01.363 +0000 INFO CMMaster - event=removeBucket remove bucket bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC from peer=65579980-63C7-4ABF-89F1-1E12D852E3F4 peer_name=TEST.com:8089 frozen=true
05-21-2019 02:24:01.363 +0000 INFO CMBucket - Freezing bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC on peer=TEST.com:8089 guid=65579980-63C7-4ABF-89F1-1E12D852E3F4 peer_name=TEST.com:8089
05-21-2019 02:24:01.363 +0000 INFO CMPeer - removing bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC from peer=65579980-63C7-4ABF-89F1-1E12D852E3F4 peer_name=TEST.com:8089

vii). At last CM will send a notification to other peers to remove the bucket at their end by running batch_bucket_frozen job which posts the http request to peers as mentioned in 3.vi)

05-21-2019 02:24:02.204 +0000 INFO CMRepJob - running job=batch_bucket_frozen guid=08D702A6-9EA2-4634-B07B-7D30D32DD87C hp=TEST2.com:8089 _internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC

POST request on peer from CM in splunkd_access.log:

XX.X.XX.XXX - splunk-system-user [21/May/2019:02:24:02.211 +0000] "POST /services/cluster/slave/control/control/freeze_buckets_by_list?output_mode=json HTTP/1.1" 200 322 - - - 1ms

Where XX.X.XX.XXX is IP of CM .

Peer finally removes the bucket from its local storage.

05-21-2019 02:24:02.212 +0000 INFO  DatabaseDirectoryManager - cid="bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC|" found to be on remote storage
05-21-2019 02:24:02.212 +0000 INFO  BucketMover - Will freeze bkt=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC path='/opt/splunk/var/lib/splunk/_internaldb/db/db_1555290488_1554859220_697_9EA978AA-B109-473E-A526-F09AEE391FCC'
05-21-2019 02:24:02.212 +0000 INFO  BucketMover - RemoteStorageAsyncFreezer trying to freeze bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC, freezeInitiatedByAnotherPeer=true
05-21-2019 02:24:02.212 +0000 INFO  CacheManager - Evicted cachedId=bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC| freed_space=0 reason= manual earliest_time=1554859220 latest_time=1555290488
05-21-2019 02:24:02.212 +0000 INFO  CacheManager - will remove cacheId="bid|_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC|" removeRemote=0
05-21-2019 02:24:02.220 +0000 INFO  CMSlave - deleteBucket bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC, frozen=true
05-21-2019 02:24:02.220 +0000 INFO  BucketMover - RemoteStorageAsyncFreezer freeze completed succesfully for bid=_internal~697~9EA978AA-B109-473E-A526-F09AEE391FCC

Since this is the peer which received the request from CM to freeze/remove the bucket locally, hence we have freezeInitiatedByAnotherPeer=true and removeRemote=0.

ltawfall · ‎05-18-2022

@rbal_splunk

Looks like the specs have changed since you wrote this article. Seems like your description of maxGlobalDataSizeMB is now called maxGlobalRawDataSizeMB and maxGlobalDataSizeMB seems to apply to Smart Store.

maxGlobalRawDataSizeMB = <nonnegative integer>
* The maximum amount of cumulative raw data (in MB) allowed in a remote
  storage-enabled index.
* This setting is available for both standalone indexers and indexer clusters.
  In the case of indexer clusters, the raw data size is calculated as the total
  amount of raw data ingested for the index, across all peer nodes.
* When the amount of uncompressed raw data in an index exceeds the value of this
  setting, the bucket containing the oldest data is frozen.
* For example, assume that the setting is set to 500 and the indexer cluster
  has already ingested 400MB of raw data into the index, across all peer nodes.
  If the cluster ingests an additional amount of raw data greater than 100MB in
  size, the cluster freezes the oldest buckets, until the size of raw data
  reduces to less than or equal to 500MB.
* This value applies to warm and cold buckets. It does not
  apply to hot or thawed buckets.
* The maximum allowable value is 4294967295.
* Default: 0 (no limit to the amount of raw data in an index)

maxGlobalDataSizeMB = <nonnegative integer>
* The maximum size, in megabytes, for all warm buckets in a SmartStore
  index on a cluster.
* This setting includes the sum of the size of all buckets that reside
  on remote storage, along with any buckets that have recently rolled
  from hot to warm on a peer node and are awaiting upload to remote storage.
* If the total size of the warm buckets in an index exceeds
  'maxGlobalDataSizeMB', the oldest bucket in the index is frozen.
  * For example, assume that 'maxGlobalDataSizeMB' is set to 5000 for
    an index, and the index's warm buckets occupy 4800 MB. If a 750 MB
    hot bucket then rolls to warm, the index size now exceeds
    'maxGlobalDataSizeMB', which triggers bucket freezing. The cluster
    freezes the oldest buckets on the index, until the total warm bucket
    size falls below 'maxGlobalDataSizeMB'.
* The size calculation for this setting applies on a per-index basis.
* The calculation applies across all peers in the cluster.
* The calculation includes only one copy of each bucket. If a duplicate
  copy of a bucket exists on a peer node, the size calculation does
  not include it.
  * For example, if the bucket exists on both remote storage and on a peer
    node's local cache, the calculation ignores the copy on local cache.
* The calculation includes only the size of the buckets themselves.
  It does not include the size of any associated files, such as report
  acceleration or data model acceleration summaries.
* The highest legal value is 4294967295 (4.2 petabytes.)
* Default: 0 (No limit to the space that the warm buckets on an index can occupy.)

We just completed a rather painful migration to SmartStore and now we and reviewing retention settings and seeing how to control growth.

Your Smart Store articles have been super helpful to us in solving a number of our problems. Do you have any additional recommendation/insights in how to manage storage in the Smart Store world?

Sahr_Lebbie · ‎03-24-2020

Great answer @rbal_splunk

Definitely took time to answer this in depth.

Appreciated,

[SmartStore] Understand Retention with SmartStore?

New This Month in Splunk Observability Cloud - Metrics Usage Analytics, Enhanced K8s ...

Alerting Best Practices: How to Create Good Detectors

Discover Powerful New Features in Splunk Cloud Platform: Enhanced Analytics, ...