Getting Data In

Splunk offline command and bucket replication

koshyk
Super Champion

Reading through the "offline" documentation & "maintenance" mode documentation, I'm slighly confused, if we need to do both to do a long maintenance (about 8 hours) for an indexer. We don't want replication to happen as the disks are retained and the indexer contains around 20TB of data and it will take huge time if data is replicated.

Can you please confirm if my below understanding is true?

  1. splunk offline (without enforce): This will ensure all searches are serviced and gracefully make indexer peer offline. But if the "master" detects that the server is NOT up within restart_timeout period (60 secs), it will start bucket fixing tasks by replicating the copy to other indexers etc.
  2. splunk enable maintenance mode will ensure no replication happens. So enabling maintenance mode + shutdown of the indexer will ensure data will remain intact until system is all brought back correctly?

So I was planning to do Option2. Just wanted to check if my understanding is correct or anyone have better way/process to do indexer maintenance for long duration?

0 Karma
1 Solution

lguinn2
Legend

I agree, the best solution in your case is to (1) Set the cluster in maintenance mode. (2) Shutdown the indexer and perform maintenance. (4) Restart the indexer. (5) Remove the cluster from maintenance mode.

This will prevent the cluster from going into recovery mode during the maintenance window. Once the cluster exits maintenance mode, however, it will enter recovery mode to "catch up" on the replication that it forestalled during the maintenance window; this is normal and expected behavior. And you are right that this is much better than having the cluster trying to recover 20TB of data...

View solution in original post

lguinn2
Legend

I agree, the best solution in your case is to (1) Set the cluster in maintenance mode. (2) Shutdown the indexer and perform maintenance. (4) Restart the indexer. (5) Remove the cluster from maintenance mode.

This will prevent the cluster from going into recovery mode during the maintenance window. Once the cluster exits maintenance mode, however, it will enter recovery mode to "catch up" on the replication that it forestalled during the maintenance window; this is normal and expected behavior. And you are right that this is much better than having the cluster trying to recover 20TB of data...

koshyk
Super Champion

thank you

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...