Disable replication factor for cold buckets only, is it possible?

kfchen
Explorer

So I currently have an indexer cluster with RF and SF both set to 2. My hot/warm DB and my cold buckets will be on different storage disks in a storage cluster that has its own replication features, and I happen to use EC+2:1, meaning the data on my cold storage will be replicated twice.

As a result, I would like to disable replication on my cold storage, but there is currently no way to do that in Splunk (or not that I know of). I am thinking of writing a cron job that deletes all replicated buckets on the cold storage disk. For this to work, all of the indexers would have to refer to a single shared file path on the cold storage.

However, this raises the question: will search still work as normal? Let's say the main bucket is on indexer A and the replicated copy is on indexer B. If indexer A is under maintenance, would it be possible for indexer B to query the bucket with indexer A's bid? Additionally, will indexer B sense that something is wrong and try to replicate the bucket again as a warm bucket?

isoutamo
SplunkTrust

In short, you can’t do it. Even if you succeed in removing those “additional” buckets, when Splunk recognizes that the cluster has lost SF or RF it starts to rebuild the missing buckets. This will happen again and again until the retention time of those buckets has been reached.

And even if you could do it in some weird way, you would lose all support from Splunk when (I don’t say if) you have any issues with your environment.

It’s much better to configure your storage to avoid that replication, or to use some other storage instead.

kfchen
Explorer

Thanks all for the insightful discussion. Upon further research I realized I am not supposed to let my indexers see the other indexers' files either, so that's one more reason why this idea won't work out.

Cheers

kiran_panchavat
Influencer

@kfchen 

  • Unfortunately, Splunk does not provide a built-in way to disable replication specifically for cold storage. Your idea of using a cron job to delete replicated buckets in cold storage is creative, but it comes with risks: deleting replicated buckets manually might lead to data inconsistency and search issues.
  • If Indexer A is under maintenance and the main bucket is on Indexer A, Indexer B should still be able to serve searches from its replicated copy. However, this depends on the search factor (SF) being met. If the SF is not met, searches might return incomplete results.
  • If Indexer A is down and the cluster (coordinated by the cluster manager) detects that the replication factor (RF) is no longer met, it will attempt to replicate the bucket to another indexer. This process is part of Splunk's mechanism to ensure data availability and redundancy.
  • Pointing coldPath in indexes.conf at shared storage is possible, but each indexer needs its own directory there; indexers should not read or write each other's bucket directories. Also ensure that the shared storage is reliable and has sufficient performance to handle the load.
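
To make the indexes.conf point concrete, here is a minimal sketch of how hot/warm and cold paths might be split across storage tiers. The mount points (/mnt/fast, /mnt/cold) and the index name (my_index) are hypothetical, and the sketch assumes each indexer has its own dedicated cold directory rather than one shared by all peers. Note that repFactor is set per index, so as far as I know there is no setting that turns replication off for cold buckets only.

```ini
# indexes.conf (sketch; paths and index name are placeholders)
[volume:hotwarm]
path = /mnt/fast/splunk

[volume:cold]
# Assumption: this mount is dedicated to this indexer, not shared with other peers.
path = /mnt/cold/splunk

[my_index]
homePath   = volume:hotwarm/my_index/db
coldPath   = volume:cold/my_index/colddb
# thawedPath cannot reference a volume, so it uses $SPLUNK_DB here.
thawedPath = $SPLUNK_DB/my_index/thaweddb
# auto = follow the cluster's RF/SF; 0 = do not replicate this index at all.
repFactor  = auto
```

Setting repFactor = 0 would stop replication for the whole index (hot, warm, and cold alike), which is not the cold-only behavior the question asks for.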

Before proceeding with any changes, it's crucial to test your setup in a staging environment to avoid disruptions in production. If in doubt, contact Splunk Support or Professional Services (PS).

NOTE:

Official answer from support is to NOT remove any replicated buckets even with clustering disabled, as they may be marked as the Primary Bucket. It is best to let them age out.

I hope this helps. If any reply helped you, please consider adding your upvote/karma points to that reply. Thanks.