Getting Data In

How to move _internaldb to a new partition in a Clustered indexer environment

Contributor

Ahhh, the joys of running clustered indexers.

I need to find a way to move the _internaldb out of /opt/splunk/var/lib/splunk to /splunk/hot (hot/warm buckets) and /splunk/cold (cold buckets) in a clustered environment.

The config is easy: in the cluster master's $SPLUNK/etc/master-apps/_cluster/local/indexes.conf I'll add an entry in there for the _internaldb and point it at the right place -- stealing the original config from $SPLUNK/etc/master-apps/_cluster/default/indexes.conf.
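Roughly, the override stanza might look like this, mirroring the stock default stanza but with the home and cold paths pointed at the new partitions (the paths here are just my intended targets, not anything canonical):

```ini
[_internaldb]
# override home/cold onto the new partitions; leave thawedPath at its default
homePath = /splunk/hot/_internaldb/db
coldPath = /splunk/cold/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
```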

What I can't figure out is the mechanics of the move.

I have to make the change at the cluster master layer, and have it push the change out. But, if I do that, then the old _internaldb data will get orphaned in the old location unless it gets copied to the new space.

Should I take all 4 indexers down at once, move the _internaldb data to the new space manually, edit the config on each indexer manually and then restart them? Then update the config on the cluster master and re-push the bundle?

Not sure how to proceed here.

1 Solution

Splunk Employee

The most straightforward way would be to stop/offline the entire cluster, move the data from the existing location to the new one, update the config, and restart the cluster.

If you want to avoid having the whole cluster down, though, you can do this one or two nodes at a time. The key is that you can move the orphaned data after you've updated the config. So, in this approach, update the config across the whole cluster first; once that's done, go back to each node in turn, offline it, move the data, and bring it back online before moving to the next node. The biggest problem you'll have here is conflicting bucket IDs, so when you move the data, you'll have to check the old buckets and rename them with new, non-conflicting ID numbers.
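The bucket rename could be sketched along these lines. This is a hypothetical helper, not a Splunk tool: it assumes standalone-style bucket directory names of the form `db_<latestTime>_<earliestTime>_<bucketId>` and shifts each ID by an offset you've verified is larger than any ID already in use at the destination.

```shell
# Hypothetical sketch: offset the bucket-ID segment of each warm/cold
# bucket directory so moved buckets won't collide with IDs already in
# use at the new location. Assumes names like db_<latest>_<earliest>_<id>.
rename_buckets() {
  dir=$1      # directory holding the old buckets
  offset=$2   # offset known to clear every bucket ID at the destination
  for b in "$dir"/db_*_*_*; do
    [ -d "$b" ] || continue
    name=$(basename "$b")
    latest=$(echo "$name" | cut -d_ -f2)
    earliest=$(echo "$name" | cut -d_ -f3)
    id=$(echo "$name" | cut -d_ -f4)
    # rebuild the name with the shifted ID
    mv "$b" "$dir/db_${latest}_${earliest}_$((id + offset))"
  done
}
```

Clustered buckets carry an extra GUID segment, so you'd adapt the field positions accordingly; the point is only that the ID segment must end up unique.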

You could also just cheat a bit. You're not supposed to have indexes.conf differ between nodes, but it's probably okay here since it would only be temporary. Go to each node, take it offline, drop a config into a new temporary app that overrides the default index location, move the buckets, and start the node back up. Repeat for each node. Then, when you're done, update the master config and push it out. At some point in the future, go remove the temporary app. No rush, since the effective configuration with and without the app is identical, so you shouldn't have any run-time problems. But you should remove it eventually, because in the long run you don't want the possibility of inconsistent local configs.
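The temporary-app step might look something like this per node. App name and index paths are illustrative, not anything official; you'd run this on an indexer you've already taken offline (e.g. with `splunk offline`), then move the buckets and `splunk start`:

```shell
# Hypothetical sketch: create a temporary app on one indexer that
# overrides the _internaldb paths. App name and paths are made up
# for illustration.
make_move_app() {
  # $1 = Splunk install dir, e.g. /opt/splunk
  app="$1/etc/apps/tmp_internaldb_move/local"
  mkdir -p "$app"
  cat > "$app/indexes.conf" <<'EOF'
[_internaldb]
homePath = /splunk/hot/_internaldb/db
coldPath = /splunk/cold/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
EOF
}
```

After the master-apps bundle carrying the same settings has been pushed, this app is redundant and can be deleted at leisure.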

Contributor

I took the cheating route. It worked nicely, though I did end up going through 3 full rounds of clustered indexer restarts. This procedure should probably end up in the public Splunk docs somewhere; as more people start using clustered indexers, they'll have to move indexes once in a while. Thanks!