Solved: Re: Difference between Splunk Maintenance Mode vs ...

jagadeeshm · ‎01-04-2017

We are trying to upgrade a couple of indexers in our multisite cluster to better hardware (16 cores to 24 cores, etc.). We decided to simply move the disks to the new boxes to avoid unnecessary fix-up activity and save network traffic.

What is the best way to perform this upgrade?

I am thinking:

1. Initiate maintenance mode on the cluster by running the "splunk enable maintenance-mode" command on the master node.
2. We have 4 indexers to upgrade, so stop the splunkd process on each indexer, one at a time, by running the "splunk stop" command.
3. Move the disk to the new box.
4. Start the splunkd process on this server by running the "splunk start" command.
5. Repeat steps 2 to 4 for the remaining indexers.
6. Finally, disable maintenance mode by running the "splunk disable maintenance-mode" command on the master node.

Or am I supposed to use Splunk Offline mode by extending the default interval?

Any advice?

somesoni2 · ‎01-04-2017

I would follow the cluster upgrade procedure (minus the upgrade tasks for the cluster master and search head) to do this. The only addition to your list would be to run "splunk offline" on the indexers/peer nodes before stopping them.

http://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/Upgradeacluster#Upgrade_to_a_maintenance_r...
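
A minimal sketch of the sequence described above, wrapping only the commands named in this thread. The helper names, the DRY_RUN switch, and the /opt/splunk install path are assumptions for illustration; by default the script only prints each command so the order can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch of the disk-swap sequence from this thread.
# Assumptions (not from the thread): helper names, DRY_RUN, SPLUNK_BIN path.

DRY_RUN="${DRY_RUN:-1}"                              # 1 = print only
SPLUNK_BIN="${SPLUNK_BIN:-/opt/splunk/bin/splunk}"   # assumed install path

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the master node, before touching any peer.
begin_maintenance() {
    run "$SPLUNK_BIN" enable maintenance-mode
}

# On each indexer (peer), one at a time, before moving its disk.
take_peer_offline() {
    run "$SPLUNK_BIN" offline
}

# On the new box, after the disk has been moved.
bring_peer_online() {
    run "$SPLUNK_BIN" start
}

# On the master node, after the last peer is back.
end_maintenance() {
    run "$SPLUNK_BIN" disable maintenance-mode
}
```

Sourcing the file and calling begin_maintenance on the master, then take_peer_offline and bring_peer_online on each indexer in turn, then end_maintenance on the master, follows the order of the list in the question. The disk move itself is a manual step between the two peer calls.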

sjohnson_splunk · ‎01-04-2017

splunk offline actually stops the indexer.

How long do you think the process will take for each indexer?

Before you put the cluster in maintenance mode, you might consider increasing the restart timeout value to some number of seconds longer than the process will take:

splunk edit cluster-config -restart_timeout 900

Also be sure to take the cluster out of maintenance mode once you are done with the process.
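
As a rough illustration of picking that number, here is a small sketch that adds a safety margin to the estimated time per indexer and prints the matching command. The function names and the 300-second default margin are assumptions, not Splunk recommendations; nothing here is executed against the cluster.

```shell
#!/bin/sh
# Sketch: derive a restart_timeout from an estimated per-indexer swap time.
# The 300-second default margin is an arbitrary assumption.

# restart_timeout_for <estimated_seconds> [margin_seconds]
restart_timeout_for() {
    estimate="$1"
    margin="${2:-300}"
    echo $((estimate + margin))
}

# Print (do not run) the command to issue on the master node.
restart_timeout_command() {
    echo "splunk edit cluster-config -restart_timeout $(restart_timeout_for "$@")"
}
```

For example, with an estimate of 600 seconds per indexer, restart_timeout_command 600 prints the same command shown above with a value of 900.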

vermasa · ‎09-19-2019

"After the peer shuts down, you have 60 seconds (by default) to complete any maintenance work and bring the peer back online. If the peer does not return to the cluster within this time, the master initiates bucket-fixing activities to return the cluster to a complete state. If you need more time, you can extend the time that the master waits for the peer to come back online by configuring the restart_timeout attribute"

But why does "restart_timeout" matter here, when you are already putting the cluster into maintenance mode, which does not allow any bucket fix-up activity?

jagadeeshm · ‎01-04-2017

As per https://answers.splunk.com/answers/464439/what-is-the-best-action-plan-during-hardwarefirmwa.html,
we don't even need to enable maintenance mode? I am trying to avoid failed searches during this upgrade process.

somesoni2 · ‎01-04-2017

Yes, enabling maintenance mode is not a requirement for upgrading the peers, but not enabling it has a certain effect on cluster health (too much bucket rolling may occur). For the short duration that the peers will be down, I would enable maintenance mode. See this for more information on the effect of not enabling maintenance mode.

https://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/Usemaintenancemode

Difference between Splunk Maintenance Mode vs Splunk Offline mode?

indexer

splunkd

upgrade
