Deployment Architecture

How to stop a cluster?

Path Finder

I have 2 clusters in my Splunk environment, located on 4 hosts.
The hosts need to be restarted for patching, and I need to make sure the Splunk clusters go down safely and start properly after the restart.
How should I do that?

0 Karma

Champion

I think you can stop the instances one by one, and then use the rolling-restart command to start all of the peers. I also couldn't find anything about stopping the cluster from the master node.

0 Karma

Path Finder

Have you read this page?
http://docs.splunk.com/Documentation/Splunk/5.0.3/Indexer/Restartthecluster

Everything standard is explained there.

0 Karma

Path Finder

OK. gfuente has written some nice tips up there, and you can also find the proper order for upgrading Splunk here: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Upgradeacluster. That should help a lot with your project of patching the servers themselves.

Path Finder

There are 4 hosts and 2 clusters, each cluster on 2 hosts. In each cluster, the first host runs the cluster master + a peer, and the second host runs a peer + the search head.
The forwarders switch indexers every 60 seconds.
My replication factor = 2, search factor = 2.
I do not need to take all the hosts down at once; I can do it sequentially. I can easily take down the hosts with peer + search head, but I do not know how to take down the host with the cluster master so that the cluster works properly once the cluster master is started again.
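
To make the constraint above concrete, here is a minimal sketch of the worst-case arithmetic for a replication factor of 2 on 2 peers. It assumes a bucket has at most one copy per peer; the function name `copies_left` is hypothetical and is not part of any Splunk API.

```python
def copies_left(replication_factor: int, total_peers: int, peers_down: int) -> int:
    """Worst-case number of copies of a bucket that remain on running peers."""
    # Assumption: at most one copy per peer, so a bucket cannot have
    # more copies than there are peers in the cluster.
    copies = min(replication_factor, total_peers)
    # Worst case: every peer that goes down was holding one of the copies.
    return max(copies - peers_down, 0)


if __name__ == "__main__":
    # One of the two peers down: one copy of each bucket is still online.
    print(copies_left(2, 2, 1))
    # Both peers down (full cluster stop): nothing is online until they return.
    print(copies_left(2, 2, 2))
```

With one peer down at a time, a copy of every bucket stays online, which is why the sequential approach is attractive for this layout.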

0 Karma

Path Finder

Well then, it all depends on how your forwarders are configured, what your replication and search factors are, and how the various functions are split across your 2 hosts: master node? search head? indexers?

0 Karma

Path Finder

Yes, the restart is described there, but unfortunately my case is not a restart.
I need to stop the entire cluster (4 instances on 2 hosts), then the hosts are patched and restarted, and after that I need to start the whole cluster again. The patching might take a few hours.

0 Karma

Motivator

Then you should probably use the offline command on each peer, restart it, and then do the same on the other peers. Once all the peers have been restarted, you only need to restart the master; in this case you should probably kill the master node's splunk process. The cluster will continue working without the master node. Then restart the master, and when it comes back online it will synchronize with the peers.

I think this will work.
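
The order suggested above could be sketched as a small plan generator. This is only an illustration: it prints the commands rather than running them, the `$SPLUNK_HOME/bin/splunk` path is an assumed install layout, the instance names are hypothetical, and it uses `splunk stop` for the master instead of killing the process as the post proposes.

```python
from typing import List, Tuple

# Assumed install layout; adjust to your environment.
SPLUNK = "$SPLUNK_HOME/bin/splunk"


def restart_plan(peers: List[str], master: str) -> List[Tuple[str, str]]:
    """Return (instance, command) steps: peers one at a time, master last."""
    steps: List[Tuple[str, str]] = []
    for peer in peers:
        # Take the peer out of the cluster gracefully.
        steps.append((peer, f"{SPLUNK} offline"))
        # Bring it back before touching the next peer.
        steps.append((peer, f"{SPLUNK} start"))
    # The master goes last; swapped in here: a clean stop instead of a kill.
    steps.append((master, f"{SPLUNK} stop"))
    steps.append((master, f"{SPLUNK} start"))
    return steps


if __name__ == "__main__":
    # Hypothetical instance names for one of the two clusters.
    for instance, command in restart_plan(["peer1", "peer2"], "master1"):
        print(f"{instance}: {command}")
```

In the patching scenario, the host reboot would fall between the stop and start steps of whichever instances live on that host.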
