Index Cluster - Backup strategy

ykpramodhcbt · 08-21-2018

Hi All,

We have 3 indexers in an index cluster with an index master. Currently, the data is backed up periodically to AWS S3/Glacier storage.

We want to understand whether we need to shut down the indexers before we back up the buckets.

I have reviewed this: http://docs.splunk.com/Documentation/Splunk/6.5.0/Indexer/Backupindexeddata

inventsekar · 08-21-2018

Yes, stop the Splunk services before taking the backup.

For clustered data backups, see:
http://docs.splunk.com/Documentation/Splunk/7.1.2/Indexer/Backupindexeddata

Clustered data backups
Even though an indexer cluster already contains redundant copies of data, you might also want to back up the cluster data to another location; for example, to keep a copy of the data offsite as part of an overall disaster recovery plan.

The simplest way to do this is to back up the data on each individual peer node on your cluster, in the same way that you back up data on individual, non-clustered indexers, as described earlier in this topic. However, this approach will result in backups of duplicate data. For example, if you have a cluster with a replication factor of 3, the cluster is storing three copies of all the data across its set of peer nodes. If you then back up the data residing on each individual node, you end up with backups containing, in total, three copies of the data. You cannot solve this problem by backing up just the data on a single node, since there's no certainty that a single node contains all the data in the cluster.

The solution to this would be to identify exactly one copy of each bucket on the cluster and then back up just those copies. However, in practice, it is quite a complex matter to do that. One approach is to create a script that goes through each peer's index storage and uses the bucket ID value contained in the bucket name to identify exactly one copy of each bucket. The bucket ID is the same for all copies of a bucket. For information on the bucket ID, read "Warm/cold bucket naming convention". Another thing to consider when designing a cluster backup script is whether you want to back up just the bucket's rawdata or both its rawdata and index files. If the latter, the script must also identify a searchable copy of each bucket.
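As a rough illustration of the script idea described above (this is not from the Splunk docs), here is a minimal Python sketch. It assumes clustered warm/cold bucket directories are named `db_<newest_time>_<oldest_time>_<localid>_<guid>` for originating copies and `rb_...` for replicated copies, and that index name, local ID, and GUID together identify a bucket across peers. The `BucketCopy` record, the `searchable` flag, and the function name are hypothetical; how you collect the directory listing from each peer, and how you decide a copy is searchable, is left to your environment.

```python
import re
from typing import Dict, Iterable, NamedTuple, Tuple

# Assumed clustered warm/cold directory name layout:
#   db_<newest_time>_<oldest_time>_<localid>_<guid>   (originating copy)
#   rb_<newest_time>_<oldest_time>_<localid>_<guid>   (replicated copy)
BUCKET_RE = re.compile(
    r"^(?P<kind>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)"
    r"_(?P<localid>\d+)_(?P<guid>[0-9A-Fa-f-]+)$"
)


class BucketCopy(NamedTuple):
    """One bucket directory found on one peer (hypothetical record)."""
    peer: str         # peer node the directory was found on
    index: str        # index the bucket belongs to
    dirname: str      # bucket directory name
    searchable: bool  # True if the copy has index files, not just rawdata


BucketKey = Tuple[str, int, str]  # (index, localid, guid)


def select_one_copy_per_bucket(
    copies: Iterable[BucketCopy], need_searchable: bool = False
) -> Dict[BucketKey, BucketCopy]:
    """Keep the first acceptable copy seen for each bucket."""
    chosen: Dict[BucketKey, BucketCopy] = {}
    for copy in copies:
        match = BUCKET_RE.match(copy.dirname)
        if match is None:
            continue  # skip hot buckets and names that do not fit the pattern
        if need_searchable and not copy.searchable:
            continue  # rawdata-only copy, not enough for a full backup
        key = (copy.index, int(match["localid"]), match["guid"].upper())
        chosen.setdefault(key, copy)
    return chosen


if __name__ == "__main__":
    guid_a = "AAAAAAAA-1111-2222-3333-444444444444"
    guid_b = "BBBBBBBB-1111-2222-3333-444444444444"
    listing = [
        BucketCopy("peer1", "main", f"db_1534800000_1534700000_12_{guid_a}", True),
        BucketCopy("peer2", "main", f"rb_1534800000_1534700000_12_{guid_a}", False),
        BucketCopy("peer3", "main", f"rb_1534900000_1534800000_7_{guid_b}", False),
        BucketCopy("peer1", "main", "hot_v1_13", True),
    ]
    for key, copy in select_one_copy_per_bucket(listing).items():
        print(key, "->", copy.peer, copy.dirname)
```

The selected copies would then be the only directories handed to whatever uploads to S3/Glacier. Passing `need_searchable=True` restricts the selection to copies that include index files, which matches the rawdata-versus-index-files consideration mentioned above.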

Because of the complications of cluster backup, it is recommended that you contact Splunk Professional Services for guidance in backing up single copies of clustered data. They can help design a solution customized to the needs of your environment.

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
