Deployment Architecture

What are the best practices for implementing Smartstore?

Chiranjeev88
Explorer

Hi All,

I am planning to implement SmartStore in my current Splunk environment. If I only enable it for a few indexes in my cluster, do I still have to set SF=RF on my cluster master?

Also, if you could list the best practices that worked for you while implementing it, that would be helpful as well.

 

Thanks,

Chiranjeev Singh


jamie00171
Communicator

Hi @Chiranjeev88 

It's been a couple of years since I worked with SmartStore but from what I can remember:

1. Understand the search profile of the indexes you're planning to move to SmartStore. Combine searches of the _audit index with conversations with the users of the data to confirm how many searches run over the last 24 hours, last 7 days, last 30 days, etc. If there are frequent searches looking back over long time ranges, reconsider moving the index to SmartStore.

2. Configure an appropriate cache size (I believe the requirements will increase if using accelerated data models)


3. Consider the extra load SmartStore puts on the indexers' disks. When a search needs data from the remote store, the files / buckets are downloaded and then have to be written to disk. The default setting of "max_concurrent_downloads" is 8, which means up to 8 buckets can be downloaded concurrently and therefore up to 8 buckets can be written to disk concurrently. Because of this, Splunk recommended we change our disks to NVMe SSD before moving to SmartStore. You can reduce some of the extra load on the disk by reducing the number of concurrent downloads; of course, this comes at the cost of search performance.

4. Consider the performance / current hardware of the remote store. Some object stores can only read a certain number of "large" objects (tsidx or raw data files) at one time before performance degrades significantly. If you have 8 concurrent downloads per indexer and 50 indexers, a single search can potentially try to read 400 large objects at one time. Confirm the remote store can handle that many concurrent reads.

5. Consider using workload rules (https://docs.splunk.com/Documentation/Splunk/9.0.3/Workloads/WorkloadRules) to control who can search the SmartStore indexes over longer time ranges, so that users don't unknowingly download a large number of buckets.

6. If you have a lot of sparse or super sparse searches running over longer time ranges, consider increasing the default of "hotlist_bloom_filter_recency_hours" in server.conf (or indexes.conf to apply it to a specific index) to keep the bloomfilters and smaller metadata files for buckets on the disk of the indexers.

7. The recommended bucket size for SmartStore indexes is 750MB (maxDataSize = auto in indexes.conf), so if you have any auto_high_volume (10GB) indexes, consider switching to auto a week or so before moving to SmartStore so that the buckets being searched are the recommended size.

8. Test SmartStore in production using a test index (e.g. by copying real data to a new index) before moving an index that is in use. This allows you to confirm the performance of the remote store and your indexers. Once you move an index to SmartStore you can't move it back.
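
To put rough numbers on points 3 and 4, here is a minimal back-of-the-envelope sketch in Python. The function names are hypothetical and not part of Splunk; the figures (8 concurrent downloads, 50 indexers, 750MB buckets) are the ones quoted above. It assumes the worst case where every indexer uses all of its download slots at once and every download is a full-size bucket, so treat the result as an upper bound rather than a measurement.

```python
# Rough load estimate for a SmartStore remote store.
# Helper names are hypothetical; only the input figures come from the post.

def peak_remote_reads(indexers: int, max_concurrent_downloads: int = 8) -> int:
    """Worst-case number of objects read from the remote store at once."""
    return indexers * max_concurrent_downloads


def peak_inflight_mb(indexers: int,
                     max_concurrent_downloads: int = 8,
                     bucket_mb: int = 750) -> int:
    """Upper bound on data in flight, assuming every download is a full bucket."""
    return peak_remote_reads(indexers, max_concurrent_downloads) * bucket_mb


if __name__ == "__main__":
    # 50 indexers with the default of 8 concurrent downloads each.
    print("concurrent reads:", peak_remote_reads(50))
    # Halving the downloads per indexer halves the load on the remote store.
    print("concurrent reads at 4 downloads:", peak_remote_reads(50, 4))
    print("upper bound in flight (MB):", peak_inflight_mb(50))
```

Running the same calculation with a lower download count shows the trade-off from point 3: less pressure on the disks and the remote store, but fewer buckets fetched in parallel for each search.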

Thanks, 

Jamie

 

richgalloway
SplunkTrust

Yes, you still need to set SF=RF.

#1 best practice: Don't use cloud-based S2 (SmartStore) with on-prem indexers.
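
If you want to sanity-check the SF=RF requirement before enabling SmartStore, something like the following Python sketch could work. It assumes the cluster manager's server.conf has a [clustering] stanza with both replication_factor and search_factor set explicitly, and that the file is simple enough for Python's configparser to read (real Splunk .conf files can contain things it does not handle). The function name and the sample text are made up for illustration.

```python
import configparser

# Hypothetical sample of a cluster manager's server.conf, for illustration only.
SAMPLE_MISMATCH = """
[clustering]
mode = manager
replication_factor = 3
search_factor = 2
"""

SAMPLE_MATCH = """
[clustering]
mode = manager
replication_factor = 3
search_factor = 3
"""


def sf_equals_rf(server_conf_text: str) -> bool:
    """Return True when search_factor equals replication_factor.

    Raises KeyError if the [clustering] stanza or either setting is missing.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(server_conf_text)
    clustering = parser["clustering"]
    rf = int(clustering["replication_factor"])
    sf = int(clustering["search_factor"])
    return sf == rf


if __name__ == "__main__":
    print("mismatch sample ok?", sf_equals_rf(SAMPLE_MISMATCH))
    print("match sample ok?", sf_equals_rf(SAMPLE_MATCH))
```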

---
If this reply helps you, Karma would be appreciated.

Chiranjeev88
Explorer

We have an on-prem implementation on AWS, so should we still avoid implementing SmartStore?

richgalloway
SplunkTrust

AWS is not on-prem (unless you own AWS  😀).

---
If this reply helps you, Karma would be appreciated.