Activity Feed
- Got Karma for Re: Is there a REST API call for getting the status of a Search Head Cluster (SHC)?. 11-22-2024 04:08 AM
- Got Karma for Re: [SmartStore]Trigger that would cause a cluster to resync from remote storage. 08-15-2024 12:25 PM
- Got Karma for Re: [SmartStore]Trigger that would cause a cluster to resync from remote storage. 05-08-2024 01:36 AM
- Got Karma for Re: [SmartStore] What is Cluster bootstrap process?. 05-04-2024 06:10 AM
- Got Karma for Re: [smartstore] splunk smartstore and Data integrity. 01-04-2024 05:15 PM
- Got Karma for Large lookup caused the bundle replication to fail. What are my options?. 11-20-2023 12:13 PM
- Got Karma for Re: Large lookup caused the bundle replication to fail. What are my options?. 11-20-2023 12:13 PM
- Got Karma for Re: Smartstore:SmartStore cache is not respecting cache limits. 11-10-2023 09:22 PM
- Got Karma for Re: [smartstore] How to map S2 smartstore buckets to local splunk bucket?. 08-18-2023 09:03 AM
- Got Karma for Re: access.log indexed multiple times. 07-10-2023 02:20 AM
- Got Karma for Re: Too many Events generated for Windows Security EventCode 4662 causing high resource issues like CPU. 04-24-2023 09:28 AM
- Got Karma for Re: [SmartStore] How to Analyse the CacheSize?. 04-04-2023 02:02 AM
- Got Karma for Re: [SmartStore] How is the Replication of Summary bucket managed in Splunk Smartstore?. 01-18-2023 05:39 PM
- Got Karma for Large lookup caused the bundle replication to fail. What are my options?. 01-06-2023 01:45 PM
- Got Karma for Re: Large lookup caused the bundle replication to fail. What are my options?. 01-06-2023 01:45 PM
- Posted Re: Post upgrade of 3 Node Search Head Cluster from version 8.2.7, one SHC Kvstore status as DOWN on Knowledge Management. 11-29-2022 01:00 PM
- Posted Post upgrade of 3 Node Search Head Cluster from version 8.2.7, one SHC Kvstore status as DOWN on Knowledge Management. 11-29-2022 12:54 PM
- Got Karma for Re: data rebalance progresses is very poor or getting stuck. 08-14-2022 12:37 AM
- Posted Re: Are there any controls to limit the size of a user search? on Splunk Search. 07-25-2022 09:03 AM
- Posted Are there any controls to limit the size of a user search? on Splunk Search. 07-25-2022 08:58 AM
Topics I've Started
11-29-2022
01:00 PM
From mongod.log, recovery failed because of OplogStartMissing, which is a known issue: https://jira.mongodb.org/browse/SERVER-40954

Error:

2022-11-29T05:01:57.080Z I REPL [rsBackgroundSync] Starting rollback due to OplogStartMissing: Our last op time fetched: { ts: Timestamp(1669697961, 2), t: 79 }. source's GTE: { ts: Timestamp(1669698089, 2), t: 80 } hashes: (6527934590833943207/-6009016642415496648)
2022-11-29T05:01:57.102Z F ROLLBACK [rsBackgroundSync] RecoverToStableTimestamp failed. :: caused by :: UnrecoverableRollbackError: No stable timestamp available to recover to. You must downgrade the binary version to v3.6 to allow rollback to finish. You may upgrade to v4.0 again after the rollback completes. Initial data timestamp: Timestamp(1669697961, 2), Stable timestamp: Timestamp(0, 0)

To resolve the issue:

# splunk stop
# splunk clean kvstore --local
# splunk start

Once the KV store came back up, it was on 4.0. We then manually upgraded the KV store to 4.2 per "Upgrade KV store server to version 4.2" in the documentation: https://docs.splunk.com/Documentation/Splunk/9.0.2/Admin/MigrateKVstore
11-29-2022
12:54 PM
The environment is a 3-node Search Head Cluster that was on 8.2.7. The nodes were upgraded from version 8.2.7 to 9.0.2. Post upgrade, one SHC member's KV store status was DOWN.
Labels: kvstore
07-25-2022
09:03 AM
You would control this with the `authorize.conf` settings srchTimeWin and srchTimeEarliest, plus Workload Management (WLM) rules.
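For illustration, a sketch of the authorize.conf side of this. The role name and values are hypothetical examples, not recommendations; both settings are in seconds:

```ini
# Hypothetical role stanza in authorize.conf (name and values are examples).
[role_capped_search]
# Cap the width of any search's time range at one day.
srchTimeWin = 86400
# Do not allow searches to look further back than seven days.
srchTimeEarliest = 604800
```

A narrower time window indirectly limits how much data a search can pull from SmartStore into the cache, though it is not a byte-level quota.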
07-25-2022
08:58 AM
Are there any controls to limit the size of a user search? The use case is Splunk Cloud: limiting a search if it downloads, for example, more than 10TB from SmartStore to the cache.
Tags: splunk-search
Labels: search job inspector
05-20-2021
08:17 AM
The problem with RAID5/6, even with SSDs and especially with SmartStore, is that you add at least two more dimensions of access patterns and obviously a lot more linear write (downloads from SmartStore to the local cache) to the game. A normal indexer does random read/write for ingestion and a lot more read while searching. The upload of a rolled bucket needs another linear read of the bucket, plus a linear write if you download the bucket again. So there is even more IO, and remember that write IO stresses the RAID because the checksums have to be calculated. This was a learning from our own tests with RAID in AWS. Also, why waste space/IOPS on your cache if you already have a copy in S3 (SmartStore, for stable buckets) or on other hosts (replication factor, for buckets that haven't been uploaded)?
05-20-2021
08:15 AM
Why avoid RAID5 on SSD when using SmartStore?
Labels: installation
11-12-2020
06:39 PM
The view is based on this search:

index="pci_posture_summary" search_name="PCI - Compliance Status History - Summary Gen" | `makemv(orig_tag)` | `mvappend_field(tag,orig_tag)` | extract kv_for_pci_compliance_status_history_summary | timechart span=`pci_compliance_history_span` latest(All) as All

If you look at the SPL for the base search "PCI - Compliance Status History - Summary Gen", each requirement refers to the scorecards on "PCI Compliance Posture", and the "All" requirement is a roll-up of the numbers from the other scorecards. The logic behind "Compliance Status History" is:
- When there is a new notable (i.e., the investigation has not started), compliance_status = -10000000000.
- When there are notables that are being investigated, compliance_status = 0.
- When all investigations are closed at the time the search runs, compliance_status = 10000000000.
11-12-2020
06:23 PM
The issue is that on the "PCI Compliance Posture" dashboard, the "Compliance Status History" view is not showing data. It just displays a line.
Labels: PCI compliance
11-12-2020
05:21 PM
This isn't an issue. We ship with references to tag=filtered, but we don't explicitly filter anything out of the box. For now, you can replace the panel's search query in pci_posture.xml with the search query below, and the customer can see events in the panel:

index="pci_posture_summary" search_name="PCI - Compliance Status History - Summary Gen" | `makemv(orig_tag)` | `mvappend_field(tag,orig_tag)` | extract kv_for_pci_compliance_status_history_summary | timechart span=`pci_compliance_history_span` latest(All) as All

All we have done here is remove the filtering condition from the search query.
11-12-2020
05:20 PM
Splunk VERSION = 8.0.6, ES version = 6.1.0, Splunk_DA-ESS_PCICompliance = 4.1.0. The issue is that on the "PCI Compliance Posture" dashboard, the "Compliance Status History" view is not showing data. It just displays "Unable to find tag filtered".
Labels: PCI compliance
10-22-2020
09:57 PM
The cache manager evicts buckets when:
(i) the total disk used by warm and cold buckets exceeds max_cache_size, or
(ii) the current free space on the partition falls below eviction_padding + minFreeSpace.

max_cache_size specifies the maximum space, in megabytes, per partition, that the cache can occupy on disk. If this value is exceeded, the cache manager starts evicting buckets. max_cache_size=0 means this feature is not used and there is no maximum size; in that case eviction happens when the $SPLUNK_DB partition's free space drops below eviction_padding + minFreeSpace.

The cache manager calculates total cache usage as the sum of the sizes of all non-hot buckets. When $SPLUNK_HOME and $SPLUNK_DB are on different partitions, the cache in $SPLUNK_DB accounts for disk space on that partition only.

Enabling DEBUG on the CacheManager component shows the cache manager's stats:

07-08-2020 19:19:58.806 +0000 DEBUG CacheManager - The system has freebytes=944143511552 with minfreebytes=471859200000 cachereserve=471859208192 totalpadding=943718408192 buckets_size=0 maxSize=0
07-08-2020 19:19:58.887 +0000 DEBUG CacheManager - The system has freebytes=944141152256 with minfreebytes=471859200000 cachereserve=471859208192 totalpadding=943718408192 buckets_size=0 maxSize=0

Where:
- freebytes >> freeBytes
- minfreebytes >> minFreeBytes
- cachereserve >> evictionReservedBytes
- totalpadding >> minFreeBytes + evictionReservedBytes
- buckets_size >> total cache usage (sum of all non-hot bucket sizes)
- maxSize >> max_cache_size
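As an illustration, the two eviction triggers above can be expressed as a small decision function. This is a sketch of the documented behavior, not Splunk's actual implementation; all values are in bytes and names mirror the DEBUG fields:

```python
# Sketch of the cache manager's two eviction triggers, as described above.
# All values are in bytes. Illustrative only, not Splunk's actual code.
def should_evict(free_bytes, min_free_bytes, eviction_padding_bytes,
                 buckets_size, max_cache_size_bytes):
    # Trigger (ii): partition free space below minFreeSpace + eviction_padding.
    if free_bytes < min_free_bytes + eviction_padding_bytes:
        return True
    # Trigger (i): total non-hot bucket size above max_cache_size.
    # max_cache_size = 0 disables this check entirely.
    if max_cache_size_bytes > 0 and buckets_size > max_cache_size_bytes:
        return True
    return False
```

With the sample DEBUG values (freebytes=944143511552, totalpadding=943718408192, maxSize=0), neither trigger fires, which matches the quiet cache in that log.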
10-22-2020
09:53 PM
Could you please help me interpret the CacheManager DEBUG output that is used to decide when to instigate eviction?
Labels: smartstore
10-22-2020
05:55 PM
Here are the steps that could work for you:
1. Enable SmartStore on the indexes.
2. Ensure all buckets have been replicated to SmartStore (migration complete).
3. Place the indexer into offline mode.
4. Delete ALL buckets from the coldPath.
5. Re-point coldPath to the hot_warm or homePath volume.
6. Remove the cold-store EBS volume.
7. Restart splunkd.
8. Run the bootstrap command on the Cluster Master: /opt/splunk/bin/splunk _internal call /services/cluster/master/control/control/init_recreate_index -method POST
10-22-2020
05:53 PM
We would like to remove the EBS volumes that were used for the cold store and DM summaries. The docs are not overly clear on the recommended approach: https://docs.splunk.com/Documentation/Splunk/7.3.4/Indexer/MigratetoSmartStore
Labels: configuration
10-08-2020
01:04 PM
Here are steps that can be used to check a report acceleration summary and the corresponding bucket upload to the remote store. You can use the REST endpoint:

| rest /servicesNS/-/-/admin/summarization splunk_server=local | table summary.hash, summary.id, summary.is_inprogress, summary.size, summary.time_range, summary.complete, saved_searches.admin;search;*

The normalized summary id ("NS6f37597da0cade4c" here) matches the name that appears in the bucket paths below. Then list the remote store:

$SPLUNK_HOME/bin/splunk cmd splunkd rfs -- ls --starts-with volume:my_s3_vol | grep -i '/ra/'
4220,_internal/ra/0c/a8/26~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/metadata
75,_internal/ra/0c/a8/26~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/metadata_c
71680,_internal/ra/0c/a8/26~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/ra_data
799,_internal/ra/0c/a8/26~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/receipt.json
4220,_internal/ra/37/73/28~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/metadata
75,_internal/ra/37/73/28~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/metadata_c
71680,_internal/ra/37/73/28~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/ra_data
799,_internal/ra/37/73/28~949FE8DD-2419-4F07-A151-77B02413A437/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/receipt.json
6294,_internal/ra/3c/fd/27~DA6E5901-FAF9-4AC1-855C-8C5E53A87B23/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS6f37597da0cade4c/guidSplunk-949FE8DD-2419-4F07-A151-77B02413A437/metadata
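As an illustration, matching such a listing against a summary hash can be scripted. A small sketch, assuming the "size,path" line format shown in the output above:

```python
# Sketch: filter `splunkd rfs -- ls` output ("size,path" lines) down to
# the report-acceleration objects belonging to one normalized summary id.
def objects_for_summary(listing_lines, summary_hash):
    matches = []
    for line in listing_lines:
        size, sep, path = line.partition(",")
        # Keep only RA objects whose path contains the summary hash.
        if sep and "/ra/" in path and summary_hash in path:
            matches.append((int(size), path))
    return matches
```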
10-08-2020
01:00 PM
After SmartStore was enabled for the deployment, the indexers' logs are flooded with messages like:

INFO CacheManagerHandler - cache_id="ra|tto_uswest2_tomcatfrontend~39~4345D76C-80D6-4BC7-991F-EA835C2B892C|08281223-D92B-4A36-BCA0-83970376D322_tto_search_agupta13_NS2480590abee10f99" not found cache_id = ra|tto_uswest2_tomcatfrontend~39~4345D76C-80D6-4BC7-991F-EA835C2B892C|

What is the best way to find the bucket corresponding to report acceleration?
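For illustration, such a cache_id can be broken into its parts. A sketch in Python; the field layout ("ra|<index>~<bucket>~<guid>|<sid>_<app>_<user>_NS<hash>") is inferred from the log sample above, not from a documented spec:

```python
import re

# Sketch: pull the index, bucket id, peer GUID, and normalized summary id
# out of a report-acceleration cache_id as logged by CacheManagerHandler.
CACHE_ID_RE = re.compile(
    r"^ra\|(?P<index>[^~]+)~(?P<bucket>\d+)~(?P<peer_guid>[^|]+)\|"
    r".*_(?P<summary_id>NS[0-9a-f]+)$"
)

def parse_ra_cache_id(cache_id):
    m = CACHE_ID_RE.match(cache_id)
    return m.groupdict() if m else None
```

The summary_id field can then be matched against summary.hash from the REST summarization endpoint to find the accelerated report.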
Labels: stats
08-12-2020
08:18 AM
Currently there is an issue in the calculation of cache_size, so the recommendation is to set max_cache_size=0 (to ignore it). To manage the cache size, use eviction_padding instead; see https://docs.splunk.com/Documentation/Splunk/8.0.5/Admin/Serverconf:

eviction_padding = <positive integer>
* Specifies the additional space, in megabytes, beyond 'minFreeSpace' that the
  cache manager uses as the threshold to start evicting data.
* If free space on a partition falls below
  ('minFreeSpace' + 'eviction_padding'), then the cache manager tries to evict
  data from remote storage enabled indexes.
* Default: 5120 (~5GB)

Set this value to the amount of disk space you would like to keep free.
07-20-2020
10:39 AM
1 Karma
For an indexer cluster, the summary is created on the peer node that is primary for the associated bucket or buckets. The peer then uploads the summary to remote storage. When a peer needs the summary, its cache manager fetches it from remote storage. Summary replication between peers is not needed; the uploaded summary is available to all peer nodes.

Here is an example from my environment showing a report acceleration bucket on the remote store:

[root@centos65-64sup02 rbal]# $SPLUNK_HOME/bin/splunk cmd splunkd rfs -- ls --starts-with index:main | grep -v '/db/'
# for full paths run: splunkd rfs -- ls --starts-with volume:my_s3_vol/main/
size,name
1080,main/ra/38/01/16~3D41EF74-A16D-421D-9FD7-83B3849101B2/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS16c348adc086860d/guidSplunk-3D41EF74-A16D-421D-9FD7-83B3849101B2/metadata.csv
75,main/ra/38/01/16~3D41EF74-A16D-421D-9FD7-83B3849101B2/3F3F537C-7DAD-4CF8-B062-168D17BC15C7_search_admin_NS16c348adc086860d/guidSplunk-3D41EF74-A16D-421D-9FD7-83B3849101B2/metadata_checksum

NOTE: In this example, "guidSplunk-3D41EF74-A16D-421D-9FD7" is the GUID of the search head where the data is accelerated. So in my case, based on the report-accelerated search and its time range, only the relevant buckets were accelerated.

Splunk has an open bug, SPL-186425: "S2: Rebuilding an evicted DMA summary causes us to re-upload the old tsidx file with the newly rebuilt one." This means we can see upload/download activity on a bucket while buckets are being accelerated. Per this JIRA, when rebuilding an evicted DMA summary, for some reason the remote copy is localized first and, in parallel, the summary is rebuilt on disk.

For the graph posted, the upload activity was due to report acceleration:

03A227A6-442C-4EC2-96BA-EDB3AEBCB2DF_XXXX_commerce_products_emmett_NSab5a2628876cea87
03A227A6-442C-4EC2-96BA-EDB3AEBCB2DF_XXXX_partnerships_jamesw_NS9c8a6f5149bf222c

To get the name of the corresponding Splunk report, use the REST endpoint on the search head:

| rest servicesNS/-/-/admin/summarization
| table saved_searches.admin;search;test_support_ra.name, summary.hash, summary.earliest_time, summary.complete, summary.id
07-20-2020
10:12 AM
There has been a huge spike in the number of uploads, resulting in many more failed uploads from throttling than we had before. It is currently unclear to me what caused this: whether constant retries underlie the huge spike, or some new data being uploaded. The bucket size has remained fairly constant, but the number of daily uploads has gone from about 80k to 4 million. Looking at some of the S3 access logs, it seems like search objects are getting uploaded. Most of these uploads are for "ra" (report acceleration) buckets.

index=_internal host=<XXX> sourcetype=splunkd action=upload status=succeeded NOT cacheId=ra* | rex field=cacheId "bid\|(?<indexname>\w+)\~\w+\~" | timechart span=1m partial=f limit=50 per_second(kb) as kbps by indexname
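The `rex` in the search above can be sanity-checked outside Splunk. A small sketch in Python; the "bid|<indexname>~<bucketid>~<guid>|" layout is taken from the cacheId values seen in the logs:

```python
import re

# Sketch: the same extraction as the SPL rex
#   "bid\|(?<indexname>\w+)\~\w+\~"
# reproduced in Python so it can be tested against sample cacheId values.
REX = re.compile(r"bid\|(?P<indexname>\w+)~\w+~")

def index_from_cache_id(cache_id):
    m = REX.search(cache_id)
    return m.group("indexname") if m else None
```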
Tags: smartstore
06-30-2020
11:29 AM
1 Karma
The best way to manage this would be to enable S3 bucket versioning and S3 access logs, and monitor for Splunk buckets with more than one version in S3. If the purpose of data integrity control is to detect alterations to Splunk bucket data files, then S3 object versioning is a great way to detect alterations. So, for SmartStore-enabled indexes, integrity control is offloaded to the object storage: typical implementations of version control and object logging can provide functionality similar to the data integrity control feature.
06-30-2020
11:27 AM
This question has come up a few times: how does Splunk handle data integrity in a large ES implementation? The Splunk docs state: "Data integrity control feature. SmartStore-enabled indexes are not compatible with the data integrity control feature, described in Manage data integrity in the Securing Splunk Enterprise manual." As covered in https://docs.splunk.com/Documentation/Splunk/8.0.4/Indexer/AboutSmartStore
Labels: indexer clustering
06-29-2020
12:55 PM
The computation of max_cache_size is broken across versions 7.2.6 to 8.0.4. As a workaround, the best option is to enforce the cache limit with max_cache_size=0 and eviction_padding=<CONFIGURE_AS_PER_DESIRED_LIMIT>. For details on the JIRA, contact Splunk Support.
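A sketch of the workaround stanza in server.conf on the indexers. The eviction_padding value is an example only; set it to the amount of disk space (in MB, beyond minFreeSpace) you want to keep free:

```ini
# server.conf (e.g. pushed to the peers via the cluster master's _cluster app)
[cachemanager]
# 0 disables the size-based limit, which is miscalculated in 7.2.6-8.0.4.
max_cache_size = 0
# Example value: evict when free space drops below minFreeSpace + ~500GB.
eviction_padding = 512000
```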
06-29-2020
12:50 PM
The cluster indexers across sites are configured with SmartStore. Each indexer has a 6TB partition shared by $SPLUNK_HOME and $SPLUNK_DB. The cache manager is configured as below (btool-style listing; each setting is prefixed with its source file):

$SPLUNK_HOME/etc/system/default/server.conf              [diskUsage]
$SPLUNK_HOME/etc/system/default/server.conf              minFreeSpace = 5000
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   [cachemanager]
$SPLUNK_HOME/etc/system/default/server.conf              evict_on_stable = false
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   eviction_padding = 5120
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   eviction_policy = lru
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   hotlist_bloom_filter_recency_hours = 720
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   hotlist_recency_secs = 604800
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   max_cache_size = 4096000
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   max_concurrent_downloads = 8
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   max_concurrent_uploads = 8
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   remote.s3.multipart_max_connections = 4
$SPLUNK_HOME/etc/slave-apps/_cluster/local/server.conf   remote.s3.multipart_upload.part_size = 536870912

The indexer shows the 6TB partition 97% utilized, although usage should not have crossed 4TB based on max_cache_size = 4096000:

Filesystem      1K-blocks   Used        Available  Use% Mounted on
devtmpfs        71967028    0           71967028   0%   /dev
tmpfs           71990600    0           71990600   0%   /dev/shm
tmpfs           71990600    4219944     67770656   6%   /run
tmpfs           71990600    0           71990600   0%   /sys/fs/cgroup
/dev/nvme0n1p2  20959212    6812056     14147156   33%  /
none            71990600    0           71990600   0%   /run/shm
/dev/nvme1n1    6391527336  5864488560  204899848  97%  /opt/splunk
tmpfs           14398120    0           14398120   0%   /run/user/1003

Here are the DEBUG entries for the CacheManager:

06-10-2020 19:32:42.604 +0000 DEBUG CacheManager - The system has freebytes=210838605824 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:32:42.607 +0000 DEBUG CacheManager - The system has freebytes=210838536192 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:32:46.502 +0000 DEBUG CacheManager - The system has freebytes=210850021376 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:32:46.505 +0000 DEBUG CacheManager - The system has freebytes=210850172928 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000
06-10-2020 19:33:06.727 +0000 DEBUG CacheManager - The system has freebytes=210255511552 with minfreebytes=5242880000 cachereserve=5368709120 totalpadding=10611589120 buckets_size=3069799919616 maxSize=4294967296000

Note, from the DEBUG observations:

freebytes    = 210072649728
minfreebytes = 5242880000
cachereserve = 5368709120
totalpadding = 10611589120
buckets_size = 3069785296896  <<<<<< ~3TB, as calculated by the cache manager
maxSize      = 4294967296000  <<<<<< the configured 4TB limit

The issue is that the cache has utilized almost 6TB of disk space, but the cache manager's calculation shows usage of only 3TB. Due to this miscalculation, Splunk is not evicting buckets.
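As an illustration, the DEBUG line can be parsed into numbers and compared against `df` output. A small sketch; the field names are taken from the log sample:

```python
import re

# Sketch: parse one CacheManager DEBUG line so the cache manager's view
# (buckets_size vs. maxSize) can be compared against actual disk usage.
FIELDS = ("freebytes", "minfreebytes", "cachereserve",
          "totalpadding", "buckets_size", "maxSize")

def parse_cachemanager_debug(line):
    # re.search finds the leftmost match, so "freebytes=" is matched
    # before the "freebytes=" substring inside "minfreebytes=".
    return {f: int(re.search(rf"{f}=(\d+)", line).group(1)) for f in FIELDS}
```

For the sample line, buckets_size is about 3.07TB while `df` reports about 5.86TB used, so the cache manager never sees the cache cross maxSize (about 4.29TB) and eviction never fires, which matches the symptom described.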
Labels: indexer clustering
06-26-2020
08:43 AM
- The coldPath is needed during migration, when pre-existing data is migrated to SmartStore.
- As discussed in our documentation: "Cold buckets can, in fact, exist in a SmartStore-enabled index, but only under limited circumstances. Specifically, if you migrate an index from non-SmartStore to SmartStore, any migrated cold buckets use the existing cold path as their cache location, post-migration. In all respects, cold buckets are functionally equivalent to warm buckets. The cache manager manages the migrated cold buckets in the same way that it manages warm buckets. The only difference is that the cold buckets will be fetched into the cold path location, rather than the home path location."

coldPath and homePath can point to the same volume but different directories, for example:

homePath = volume:hot/$_index_name/db
coldPath = volume:hot/$_index_name/colddb

So in your case, since you have already migrated to SmartStore, you can now point the coldPath at the same volume as the homePath.
06-26-2020
08:41 AM
We migrated almost all of our existing indexes from traditional indexes, with separate warm and cold mount paths, to SmartStore a little under a year ago. It has all worked great; however, for indexes with long-term retention, buckets that were in the coldPath at the time of the SmartStore conversion continue to be stubbed out and localized from S3 back into the coldPath, while everything since the conversion uses the warm path, as expected, since that mount is the SPLUNK_DB definition used by the SmartStore indexes. I want to re-map the SPLUNK_COLD path to use the same OS mount, but what is the supported way to do that with SmartStore? From the documentation (https://docs.splunk.com/Documentation/Splunk/7.3.3/Indexer/Moveanindex) it sounds like you would normally copy the data manually from the old path to the new one and then re-map the variable; does it work the same with SmartStore? Or is it just a matter of force-clearing the SmartStore cache on the OS mount I want to clear off, re-mapping the variable, and letting new localization of buckets simply use the re-mapped path?
Tags: smartstore
Labels: indexer