Deployment Architecture

How do you restore buckets in an indexer cluster?

Explorer

I have a bunch of buckets that I want to restore. According to the documentation, the first step is finding the buckets you want to restore and copying them to the $SPLUNK_HOME/var/lib/splunk/{INDEX}/thaweddb directory. Then you have to run the rebuild command. The documentation does not make clear, however, whether you have to thaw the buckets on the same indexer that they came from.

I am looking at http://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/Restorearchiveddata

I am running version 6.5.2 with an indexer cluster.

First, the documentation says that on versions 4.2 and higher, you can thaw data on any indexer instance, not just the one that it originated on.

"For the most part, you can restore an archive to any instance of the indexer, not just the one that originally indexed it. This, however, depends on a couple of factors: Splunk Enterprise version. You cannot restore a bucket created by Splunk Enterprise 4.2 or later to a pre-4.2 indexer. The bucket data format changed between 4.1 and 4.2, and pre-4.2 indexers do not understand the new format. This means: 4.2+ buckets: You can restore a 4.2+ bucket to any 4.2+ instance."

Then, at the bottom of the page, it talks about restoring data in a clustered environment and says that you should place the buckets in the thawed directories of the indexers they originated on:

"However, as described in "Archive indexed data", it is difficult to archive just a single copy of clustered data in the first place. If, instead, you archive data across all peer nodes in a cluster, you can later thaw the data, placing the data into the thawed directories of the peer nodes from which it was originally archived."

Do I have to thaw buckets only on the indexer that the data originated on?

SplunkTrust

Do I have to thaw buckets only on the indexer that the data originated on?

No, you can restore the bucket on any indexer instance that is running version 4.2 or later. If you have the rawdata, you will need to run the bucket rebuild described in the "Thaw a 4.2+ archive" section of the documentation on restoring archived indexed data (6.5.2-specific link).
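
As a rough sketch of the copy-then-rebuild steps from that documentation section: the helper below only works out where the bucket should land and what the rebuild command line would look like. It does not copy or run anything. The function name `plan_thaw`, the `/opt/splunk` and `/archive/frozen` paths, and the bucket name are made up for illustration, and the `splunk rebuild <bucket directory>` form is how I read the linked page, so check it against your version first.

```python
from pathlib import PurePosixPath


def plan_thaw(splunk_home, index_dir, archived_bucket):
    """Return (destination, rebuild_command) for thawing one archived bucket.

    Nothing is copied or executed here; the caller decides what to do
    with the plan. index_dir is the index's directory name under
    var/lib/splunk (for example "defaultdb" for the main index).
    """
    bucket_name = PurePosixPath(archived_bucket).name
    destination = (
        PurePosixPath(splunk_home) / "var" / "lib" / "splunk"
        / index_dir / "thaweddb" / bucket_name
    )
    splunk_bin = PurePosixPath(splunk_home) / "bin" / "splunk"
    # Assumed command form: splunk rebuild <path to thawed bucket directory>
    rebuild_command = [str(splunk_bin), "rebuild", str(destination)]
    return str(destination), rebuild_command


dest, cmd = plan_thaw(
    "/opt/splunk",                                    # hypothetical SPLUNK_HOME
    "defaultdb",
    "/archive/frozen/db_1181756465_1162600547_1001",  # hypothetical archive path
)
print(dest)
print(" ".join(cmd))
```

After copying the bucket to the destination and running the rebuild, the same documentation page also calls for restarting the indexer.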

In a clustered environment you may have multiple copies of a bucket, which can make it trickier to know which one to restore, but that will not affect restoring/thawing a bucket. You can restore it on a new instance or on a current cluster member. Note that in some versions (6.5.x, from memory) the thawed directory does not work as expected in a cluster until 6.5.7; the workaround is to restore to a non-clustered instance.

Builder

@sjcoluccio67: Yes, according to the documentation, if you are thawing buckets back into the cluster, you have to thaw them on the indexers where the buckets originated.
I prefer to thaw the buckets (only db_*) on a stand-alone indexer and add it as a search peer to the search head. In fact, we did this in one of our use cases.
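
To illustrate the "only db_*" part: in a cluster, the originating copy of a bucket is named db_* and replicated copies are named rb_*, so filtering on the prefix leaves one copy per bucket. This is a minimal sketch; the function name `select_original_copies` and the bucket names (including the GUIDs) are invented for the example.

```python
def select_original_copies(bucket_names):
    """Keep only bucket directories whose name starts with "db_".

    Replicated copies ("rb_" prefix) are dropped, so each bucket that was
    archived from its originating peer appears once in the result.
    """
    return sorted(name for name in bucket_names if name.startswith("db_"))


# Hypothetical contents of a frozen archive collected from several peers.
archived = [
    "db_1550000000_1540000000_12_11111111-2222-3333-4444-555555555555",
    "rb_1550000000_1540000000_12_11111111-2222-3333-4444-555555555555",
    "rb_1560000000_1550000001_7_AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE",
    "db_1560000000_1550000001_7_AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE",
]

to_thaw = select_original_copies(archived)
for name in to_thaw:
    print(name)
```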

https://answers.splunk.com/answers/708814/when-backing-up-frozen-data-with-replication-facto.html#an...

Explorer

So, as long as the indexer is not part of the cluster that the buckets came from, it should be able to rebuild them all, regardless of the GUID in the bucket name?

Builder

Yes, as long as the buckets are on a stand-alone indexer, the GUIDs don't matter (you could also rename them, but I haven't tried that).
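
For reference, here is a small sketch of where the GUID sits in a clustered bucket name (`db_<newest>_<oldest>_<localid>_<guid>`) and what the name would look like with it dropped. The helper names `parse_bucket_name` and `name_without_guid` and the sample names are made up, and the rename itself is the untested idea mentioned above, so treat that part as an assumption rather than a supported procedure.

```python
import re

# db_ or rb_, newest time, oldest time, local id, optional 36-character GUID.
BUCKET_NAME = re.compile(
    r"^(?P<kind>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<local_id>\d+)"
    r"(?:_(?P<guid>[0-9A-Fa-f-]{36}))?$"
)


def parse_bucket_name(name):
    """Split a bucket directory name into its fields (guid may be None)."""
    match = BUCKET_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised bucket name: {name!r}")
    return match.groupdict()


def name_without_guid(name):
    """Return the same name minus the GUID suffix (untested rename idea)."""
    parts = parse_bucket_name(name)
    return "{kind}_{newest}_{oldest}_{local_id}".format(**parts)


clustered = "db_1550000000_1540000000_12_11111111-2222-3333-4444-555555555555"
print(parse_bucket_name(clustered)["guid"])
print(name_without_guid(clustered))
```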
