During the migration to SmartStore, the following issues were faced.
Issue 1: Many of the buckets were stuck in the fixup queue in the state "Waiting target_wait_time before replication_bucket".
Recommended:
The cluster master won't put out build requests (asking a peer to make a bucket searchable) if we're using S2 AND have use_batch_remote_rep_changes enabled (default is true).
I had to set use_batch_remote_rep_changes to false, and the cluster got to a stable, all-green state. It's a hidden attribute.
Issue 2: After the RF & SF were met and all data was searchable, disk usage on the home path increased many fold. The SmartStore migration app has the following configuration:
=====server.conf=====
[cachemanager]
eviction_policy = noevict
The "noevict" setting is recommended only during migration. At the end of the migration, this needs to be reverted to the default value of lru:
[cachemanager]
eviction_policy = lru
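To verify the effective value after the revert, a quick btool check can help (a sketch; assumes a default $SPLUNK_HOME):
$SPLUNK_HOME/bin/splunk btool server list cachemanager --debug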
Issue 3: It was found that max_size_kb = 0; this setting means the cache manager size is not configured.
server.conf:
[cachemanager]
max_size_kb=0
NOTE: max_cache_size in 7.2 is a replacement for max_size_kb in 7.1
To read more about it, refer to https://docs.splunk.com/Documentation/Splunk/7.1.4/Admin/Serverconf
The following configuration is recommended:
4.1) [diskUsage]
minFreeSpace = 10% of the disk space on the partition (in MB)
To read more about it, refer to https://docs.splunk.com/Documentation/Splunk/latest/Admin/Serverconf
4.2) [cachemanager]
eviction_padding = 10% of the disk space on the partition (in bytes)
Refer: https://docs.splunk.com/Documentation/Splunk/latest/Admin/Serverconf
4.3) max_size_kb = 75% of the partition (in KB).
In Splunk version 7.1, the settings pertinent to cache size limits use different units (bytes, KB, and MB):
max_size_kb (KB)
minFreeSpace (MB)
eviction_padding (bytes)
If free space on a partition falls below ('minFreeSpace' + 'eviction_padding'), then the cache manager tries to evict data from remote storage enabled indexes.
Note 1: It was confirmed that the settings eviction_padding and max_size_kb are used in 7.1.2 and above. In release 7.2, max_cache_size has replaced max_size_kb.
Note 2: When uptaking Splunk 7.2, please note that "max_size_kb" and "eviction_padding" have been changed to MB.
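As an illustration only (a hypothetical 1 TiB cache partition, using the 7.1 units listed above), those percentages would translate roughly like this:
=====server.conf=====
[diskUsage]
# minFreeSpace is in MB in 7.1; ~10% of 1 TiB is about 100 GiB = 102400 MB
minFreeSpace = 102400
[cachemanager]
# eviction_padding is in bytes in 7.1; ~10% of 1 TiB is about 100 GiB = 107374182400 bytes
eviction_padding = 107374182400
# max_size_kb is in KB in 7.1; 75% of 1 TiB is 768 GiB = 805306368 KB
max_size_kb = 805306368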
Issue 5: What are the various options to check the size of an index?
I am still investigating this, but here are some searches found while investigating.
5.1) Total size of index files on each indexer (note the "cached=f"):
| dbinspect index=* cached=f | stats sum(sizeOnDiskMB) by splunk_server
5.2) Total size on remote:
| dbinspect index=* | stats sum(sizeOnDiskMB)
You can tweak index=* to use a different subset of indexes.
NOTE: This does not take into account report acceleration (RA) and data model acceleration (DMA) summaries. If those are needed, then we will need a different strategy.
5.3) You can combine both and show the in-cache and total data sizes:
| dbinspect index=data cached=f
| rename sizeOnDiskMB as szMB_incache
| join splunk_server bucketId
    [| dbinspect index=data | rename sizeOnDiskMB as szMB_total]
| table splunk_server bucketId szMB_total szMB_incache
| addcoltotals
| tail 1
If the Splunk version is above 7.2.4.2, the use_batch_remote_rep_changes setting is not required.
@rbal_splunk Where is this "use_batch_remote_rep_changes" setting set? server.conf?
The use_batch_remote_rep_changes setting needs to be in server.conf, placed outside of any of the stanzas. If it's inside one of the stanzas, it'll be flagged as an invalid key.
So just put it at the top of server.conf in etc/system/local/server.conf. At least, that's where I put it, and I didn't receive any errors from btool.
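For illustration, a minimal sketch of that placement (per the reply above; verify the effective value with btool afterwards):
=====etc/system/local/server.conf=====
# Hidden attribute; goes at global scope, outside any stanza, at the top of the file
use_batch_remote_rep_changes = false
To confirm it was picked up:
$SPLUNK_HOME/bin/splunk btool server list --debug | grep use_batch_remote_rep_changes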