Deployment Architecture

Is it possible to control where to write replicated data (hot/warm/cold) in cluster (single/multi) for Splunk 6.x?

lmyrefelt
Builder

Hi,

Does anyone know whether it is possible to control where replicated data is written in a cluster? In version 5 it was said to be written to cold storage, but this is said to have changed in version 6 and higher (to hot/warm, if I remember correctly).

However, in some scenarios, for example where we have very expensive but extremely fast PCIe storage or similar, there is not much point in spending that disk space, money, and I/O on writing and storing replicated data. We would like to write it to cold storage or, EVEN BETTER, control the location with a setting in indexes.conf that points to a disk or set of inexpensive disks, which makes far more sense.

Does anyone have experience in this area?

Clarification: how can we make sure that replicated data in a cluster environment does not take up space in hot/warm storage, which is kept on high-performance disks?
Instead, we want more granular control over where the replicated data is stored, so that we can write it to larger, less expensive disks. (We are fine with this data being slower to search, or to "re-index" in the case of a failure.)

1 Solution

lmyrefelt
Builder

After talking to Splunk support, we have turned this into a feature request, so hopefully in the not too distant future we can finally get control over where replicated data is written and no longer need to waste valuable space on super-speedy disks 🙂

lmyrefelt
Builder

I filed a support case / enhancement request: case 224314

lmyrefelt
Builder

Any updates on this?
:)

luhadia_aditya
Path Finder

Well, you are correct that with Splunk 6.x (or higher), replicated data is written to hot/warm buckets.

Now, the answer to your question is NO. You cannot configure which target peer (for example, one with faster or slower disks in this scenario) your secondary (replicated) bucket resides on. The master node assigns the target peers randomly.

Source - http://docs.splunk.com/Documentation/Splunk/6.1.3/Indexer/Bucketsandclusters

"Each time the source peer starts a new hot bucket, the master gives the peer a new set of target peers to replicate data to. Therefore, while the original copies will all be on the source peer, the replicated copies of those buckets will be randomly spread across the other peers.This behavior is not configurable. The one certainty is that you will never have two copies of the same bucket on the same peer. In the case of a multisite cluster, you can also configure the site location of the replicated copies, but you still cannot specify the actual peer location."

Hope this helps! Thanks!
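
For illustration, the behavior described in the documentation quoted above can be sketched in a few lines of Python. This is only a toy model and not Splunk's actual implementation: the function name `pick_target_peers`, the peer names, and the use of a uniform random choice are all assumptions made for the example. It shows the two properties the docs state: target peers are chosen anew for each hot bucket, and no peer ever receives two copies of the same bucket.

```python
import random


def pick_target_peers(source_peer, peers, replication_factor, rng=random):
    """Toy model: choose target peers for one new hot bucket.

    Hypothetical helper, not a Splunk API. The source peer keeps the
    original copy, so only replication_factor - 1 targets are needed.
    """
    # Exclude the source peer so it never holds a second copy of its own bucket.
    candidates = [p for p in peers if p != source_peer]
    copies_needed = replication_factor - 1
    if copies_needed > len(candidates):
        raise ValueError("not enough peers for the requested replication factor")
    # sample() draws without replacement, so no target peer is chosen twice.
    return rng.sample(candidates, copies_needed)


# Assumed example cluster: four peers, replication factor 3.
peers = ["idx1", "idx2", "idx3", "idx4"]
targets = pick_target_peers("idx1", peers, replication_factor=3)
print(targets)
```

Running it repeatedly gives different target sets, which mirrors why you cannot predict (or pin) which peer's disks end up holding a replicated copy.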