I have been searching for ways to back up my Splunk 6.4.1 which is on CentOS and got these results:
So my questions are:
1) Regarding the command to roll hot buckets to warm, do I need to stop Splunk before calling the command in scripts?
2) My company uses Symantec Backup Exec to perform backups. Does anyone know if it meets Splunk's suggestion of using snapshot technology for Splunk backup? Has anyone had experience using Backup Exec to back up Splunk?
As mentioned, it's CentOS. In the manual of BE, it states that "The Agent for Linux uses advanced open file and image technologies that are designed to alleviate the issues that are sometimes encountered during backup operations, such as backing up open files.
After you make file and folder selections and submit the job for backup, the Linux Agent automatically makes a snapshot of the volume or volumes. Making a snapshot of a volume provides a point-in-time record of the data. When the Linux Agent creates a snapshot, it uses snapshot technologies to momentarily suspend write activity to a volume so that a snapshot of the volume can be created. During the backup, files can be open and data can be changed."
So I can't add much regarding what BE does, but I will say this about rolling buckets and snapshots...
1) You can roll the buckets via the documented REST endpoints and CLI commands without stopping Splunk. (When you stop and start Splunk, all hot buckets are rolled to warm anyway.)
2) Snapshots work best on warm / cold / frozen / thawed buckets. Snapshotting hot buckets can be problematic: if you have to restore one, the bucket will most likely be incomplete or have similar issues. So we don't recommend snapshotting hot buckets. You're best off rolling to warm and then backing up the warm buckets. In BE you can make the db_* / rb_* directories your backup selection and blacklist the hot_* directories from the snapshots.
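For anyone scripting the roll-then-back-up approach described above, here is a rough Python sketch rather than a tested production script. It builds the roll-hot-buckets CLI call for one index and lists the db_* / rb_* bucket directories while skipping hot_*. The function names, the Splunk binary path, the index name, and the credentials are all hypothetical placeholders, and the layout of bucket directories under the index's db path is assumed.

```python
from pathlib import Path


def build_roll_command(splunk_bin, index, user, password):
    """Return the argv list for the CLI call that rolls an index's hot buckets to warm.

    All four arguments are placeholders supplied by the caller (assumptions,
    not values taken from this thread).
    """
    return [
        splunk_bin, "_internal", "call",
        f"/data/indexes/{index}/roll-hot-buckets",
        "-auth", f"{user}:{password}",
    ]


def warm_bucket_dirs(index_db_path):
    """List db_* and rb_* bucket directories under index_db_path, skipping hot_*.

    Only directories are returned; plain files are ignored even if their
    names start with db_ or rb_.
    """
    selected = []
    for entry in sorted(Path(index_db_path).iterdir()):
        if entry.is_dir() and entry.name.startswith(("db_", "rb_")):
            selected.append(entry)
    return selected
```

The command list could be passed to `subprocess.run(..., check=True)` while Splunk is running, and the result of `warm_bucket_dirs` could then be handed to whatever backup tool you use. Keep in mind that a password passed on the command line is visible in the process list, so a real script would want a safer way to supply credentials.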
esix, by CLI commands, do you mean: splunk _internal call /data/indexes//roll-hot-buckets -auth : ? So this command does not require shutting down Splunk beforehand, right?
Any updates on the comments above, please?
I imagine that the main objective of using a snapshot would be to capture hot buckets; otherwise it would be hard to justify the snapshot overhead. Can you share example scenarios from in-house research or customer installations where hot buckets were unrecoverable or caused other problems when restored from a snapshot backup? I'm intrigued by your comment because the documentation at http://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/Backupindexeddata states that forced bucket rolls are "not generally recommended" and seems to imply that snapshots are an acceptable option for backing up hot buckets.
I haven't found much in-depth documentation or even discussion on Answers about real-world backup; good to see more discussion about this!
Exactly! They recommend using backup solutions with snapshot technology rather than rolling hot buckets to be backed up by scripts.....haha