Deployment Architecture

Index Cluster - Backup strategy

Path Finder

Hi All,

We have 3 indexers in an index cluster with an index master. Currently, the data is being backed up periodically to AWS S3/Glacier storage.

We want to understand whether we need to shut down Splunk before we back up the buckets.

Reviewed this


Super Champion

Yes, stop the Splunk services before doing the backup.

For clustered data backup...

Clustered data backups
Even though an indexer cluster already contains redundant copies of data, you might also want to back up the cluster data to another location; for example, to keep a copy of the data offsite as part of an overall disaster recovery plan.

The simplest way to do this is to back up the data on each individual peer node on your cluster, in the same way that you back up data on individual, non-clustered indexers, as described earlier in this topic. However, this approach will result in backups of duplicate data. For example, if you have a cluster with a replication factor of 3, the cluster is storing three copies of all the data across its set of peer nodes. If you then back up the data residing on each individual node, you end up with backups containing, in total, three copies of the data. You cannot solve this problem by backing up just the data on a single node, since there's no certainty that a single node contains all the data in the cluster.

The solution to this would be to identify exactly one copy of each bucket on the cluster and then back up just those copies. However, in practice, it is quite a complex matter to do that. One approach is to create a script that goes through each peer's index storage and uses the bucket ID value contained in the bucket name to identify exactly one copy of each bucket. The bucket ID is the same for all copies of a bucket. For information on the bucket ID, read "Warm/cold bucket naming convention". Another thing to consider when designing a cluster backup script is whether you want to back up just the bucket's rawdata or both its rawdata and index files. If the latter, the script must also identify a searchable copy of each bucket.

Because of the complications of cluster backup, it is recommended that you contact Splunk Professional Services for guidance in backing up single copies of clustered data. They can help design a solution customized to the needs of your environment.
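As a rough illustration of the script idea described above, here is a minimal Python sketch that picks exactly one copy of each bucket from a list of bucket directories gathered from all peers. It rests on assumptions that are not stated in this thread: that clustered warm/cold bucket directories are named `db_<newest>_<oldest>_<localid>_<guid>` for originating copies and `rb_<newest>_<oldest>_<localid>_<guid>` for replicated copies, and that index name, local ID, and GUID together identify a bucket. The function names and the `(peer, index, path)` input format are hypothetical. It does not check whether a copy is searchable, so treat it as a starting point for rawdata-only backups and verify the naming against your own Splunk version first.

```python
import re
from pathlib import PurePosixPath

# ASSUMPTION: clustered warm/cold bucket directories are named
#   db_<newest>_<oldest>_<localid>_<guid>  (originating copy)
#   rb_<newest>_<oldest>_<localid>_<guid>  (replicated copy)
BUCKET_RE = re.compile(
    r"^(?P<kind>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)"
    r"_(?P<local_id>\d+)_(?P<guid>[0-9A-Fa-f-]+)$"
)


def bucket_key(index_name, dir_name):
    """Return a key shared by all copies of a bucket, or None if the
    directory name does not look like a clustered warm/cold bucket."""
    m = BUCKET_RE.match(dir_name)
    if m is None:
        return None
    return (index_name, int(m["local_id"]), m["guid"].upper())


def pick_one_copy_per_bucket(copies):
    """copies: iterable of (peer, index_name, bucket_path) tuples.

    Returns {bucket_key: (peer, bucket_path)} with one entry per bucket.
    The originating (db_) copy is preferred purely as a deterministic
    tie-breaker; otherwise the first copy seen is kept.
    """
    chosen = {}
    for peer, index_name, path in copies:
        name = PurePosixPath(path).name
        key = bucket_key(index_name, name)
        if key is None:
            continue  # not a recognised bucket directory, skip it
        is_origin = name.startswith("db_")
        current = chosen.get(key)
        if current is None or (is_origin and not current[2]):
            chosen[key] = (peer, path, is_origin)
    return {k: (peer, path) for k, (peer, path, _) in chosen.items()}
```

You would feed it the directory listings collected from each peer's index storage and then back up only the returned paths. If you also want the index files, the script would additionally need a way to tell searchable copies from non-searchable ones, which this sketch leaves out.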
