
Frozen Buckets - Replicate or Localize

I'm mulling over a design and need some tips! Scenario: a customer has a multi-site cluster (assume low latency) with RF=2, and they want to implement frozen buckets on NAS. Peer nodes are placed across sites (no peer nodes local to each other).

Option 1: Set up a NAS in each site and point the indexer nodes to archive frozen buckets to the local NAS export. With RF=2, this lends itself cleanly to a two-site cluster and would not introduce any theoretical complexities or unnecessary additional copies (i.e. an effective RF=3). Array replication would not be needed. In the event of a single NAS failure, frozen buckets are still available to be thawed, since the second site should in theory hold a similar frozen bucket copy. Are my assumptions correct, or are there any corrections to my thinking here?

Option 2: Set up a NAS in each site, but make one of them the primary NAS. All indexers across both sites would archive their frozen buckets to this NAS (one datacenter has an advantage since the NAS is local; the other datacenter has to send data across the wire). To protect against DU, the primary NAS is replicated to a passive NAS at the other datacenter using array replication. Cons: array replication doubles the number of copies to 4x. Indexer Node A archives to the primary NAS (copy 1) and Indexer Node B archives to the primary NAS (copy 2); both are in theory the same data, and once the array is replicated there are 4 copies of it. Any corrections to my thinking here?

Option 3: Set up a NAS in each site, but make one of them the primary NAS. Only the indexers in the site local to the primary NAS would archive frozen buckets to it. The peer indexers at the secondary site would simply delete their buckets instead of rolling them to frozen. To protect against DU, the primary NAS is replicated using array-based replication to the passive NAS at the other datacenter. In theory this would result in only 2 copies of the data, similar to Option 1. From what I can tell, Splunk does not recommend doing this:
https://docs.splunk.com/Documentation/Splunk/7.2.5/Indexer/Automatearchiving
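
To make the copy counts concrete, here is a rough sketch of the arithmetic I'm using for the three options. It assumes each peer holding a bucket copy freezes it independently, and that array replication mirrors every archived copy exactly once. The function name and parameters are hypothetical, just for illustration, and not anything from Splunk.

```python
def frozen_copies(archiving_peers: int, nas_replicated: bool) -> int:
    """Count frozen copies of one bucket across both sites.

    archiving_peers: how many of the RF=2 peers archive their copy to NAS
                     (assumption: each peer freezes its copy independently).
    nas_replicated:  whether the NAS is mirrored once to the other site
                     (assumption: array replication copies everything 1:1).
    """
    copies_archived = archiving_peers
    return copies_archived * 2 if nas_replicated else copies_archived


options = {
    "Option 1 (local NAS per site, no array replication)": frozen_copies(2, False),
    "Option 2 (shared primary NAS, array replication)": frozen_copies(2, True),
    "Option 3 (one site archives, array replication)": frozen_copies(1, True),
}

for name, copies in options.items():
    print(f"{name}: {copies} copies")
```

Under those assumptions, Options 1 and 3 both end up with 2 copies and Option 2 ends up with 4, which matches my reasoning above.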

Would love to hear your thoughts on this and which option you would take.
