I am using Barracuda Yosemite Backup software to back up all my servers. Splunk on my Linux box is the only one of 40 giving me grief. I know a lot of the errors are due to changes between the scan and the actual backup pass (temp files, etc. that get created and deleted within seconds of each other).
Example output:
\Network\sigweb.signaturescience.com\File Systems\splunk_data\splunk\firewall\db\hot_v1_4691\rawdata
Error 3020: Object not found Bkp: 35766044
\Network\sigweb.signaturescience.com\File Systems\splunk_data\splunk\firewall\db\hot_v1_4691
Error 3020: Object not found Bkp: 1362042196-1362042194-7792846081280182826.tsidx
Error 3020: Object not found Bkp: 1362042198-1362042194-6395177518873184810.tsidx
Error 3020: Object not found Bkp: 1362045744-1362041885-2820434847812558834.tsidx
\Network\sigweb.signaturescience.com\File Systems\splunk_data\splunk\windows_data\db\hot_v1_618\rawdata
Error 3020: Object not found Bkp: 482298675
I just want to ensure that all the log data I have been collecting, along with settings, filters, and other core configuration and customizations of Splunk apps, are backed up. I do not need indexes, or anything else that can be regenerated if I had to rebuild the server.
So three questions:
The docs on this subject are very helpful: http://docs.splunk.com/Documentation/Splunk/5.0.2/Indexer/Backupindexeddata
The docs page also references this blog post that is even more hands-on: http://blogs.splunk.com/2011/12/20/index-backup-strategy/
I know this is a bit of a late response, but my employer was also looking for a backup solution in case something happened. After some research it dawned on me that using something like VSS (Volume Shadow Copy Service) or Veeam Backup & Replication (our environment is virtualized) should do the trick. It might be something to consider.
Hey @noncon21 - could you suggest some software to use for this purpose in a physical environment (from your knowledge)?
In Splunk terms, an index is what you would otherwise call a database. This is where all data goes.
I just want the backups to work, backing up only what is collected; what Splunk does with that data to make it searchable I assume can be "re-indexed." Maybe I am not speaking "Splunk," but in the past, indexes were temporary and could be rebuilt, like cache data.
I do not want to reinvent the wheel. I already have a backup tool; I am just trying to get Splunk to slow down enough, or take a coffee break, for me to get a decent backup. I was really hoping this could be done easily.
Did you get an answer to this? I have the same issue. Is there a command I can run before I run my tool that will back up the whole directory, or do I have to stop Splunk before I do a backup?
So I have a pre-backup line that allows me to execute a single command line before backups start; if it needs to be multi-line, you tell it to run a script instead.
What would be the best line to tell Splunk to roll all hot databases to warm and pause indexing and data collection?
-- at which point I would back up the server.
Then I would use the "post-execution" line to resume indexing and data collection?
Are there two lines that could be executed on the command line that would do this?
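The crudest version I can picture is just stopping Splunk for the duration of the backup, something like the lines below (assuming a default install under /opt/splunk - adjust the paths for your environment), but I would rather find something less disruptive if it exists:

    # Pre-execution line: stop Splunk so nothing is writing to the index
    # directories while the backup runs.
    /opt/splunk/bin/splunk stop

    # Post-execution line: start Splunk again so indexing and data
    # collection resume.
    /opt/splunk/bin/splunk start

Stopping Splunk would mean the backup tool is no longer racing against files that are created and deleted mid-backup, at the cost of an indexing gap for the whole backup window.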
Yes, the hot buckets are volatile and need to be rolled to warm to be of any use when you restore your backup. So, either just ignore the hot buckets and accept that in doing so you will miss out on the data they hold, or make sure they're rolled to warm right before you do your backup. The docs and the blog post may speak about this in 'manual' terms, but you could script this just as well.
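As a rough sketch of what scripting the roll could look like: Splunk exposes a roll-hot-buckets REST endpoint per index on the management port, which a pre-backup script could hit. The index names below are taken from your error output; the credentials and port are placeholders, and you should confirm your version supports the endpoint before relying on it:

    # Ask Splunk to roll the hot buckets of each index to warm before the backup runs.
    # "firewall" and "windows_data" are the index names from the error output above;
    # admin:changeme and port 8089 are placeholders for your environment.
    for idx in firewall windows_data; do
        curl -k -u admin:changeme -X POST \
            "https://localhost:8089/services/data/indexes/$idx/roll-hot-buckets"
    done

New hot buckets start filling immediately after the roll, so anything indexed after that point simply lands in the next backup rather than the current one.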
All these links point to manually backing up Splunk using what amounts to scheduled or manually invoked batch scripts or commands to back up the data.
I already have a tool that will back up the entire system (weekly full) and then do incrementals throughout the week.
I would like to use this tool (Yosemite: http://www.barracudaware.com/products/server-backup) and exclude, I am guessing, the "hot" databases, as those are in constant flux - right?
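To be concrete, something like this is what I am picturing for the include/exclude rules, going by the paths in the error output above (index data on the splunk_data file system, hot buckets in directories named hot_v1_<id>). The /opt/splunk path and the wildcard syntax are guesses; it would have to be adapted to whatever pattern syntax Yosemite actually accepts:

    # Include: configuration, apps, filters, and other customizations
    /opt/splunk/etc/

    # Include: index data (warm and cold buckets)
    /splunk_data/splunk/

    # Exclude: hot buckets, which are in constant flux during the backup
    /splunk_data/splunk/*/db/hot_*

That way the configuration and the warm/cold index data get backed up, and the only thing skipped is whatever is still sitting in hot buckets at backup time, which would be picked up by a later backup once those buckets roll to warm.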
"I do not need indexes" <-- ???
The indexes carry all your log data, so I'm not sure you really mean this...