Getting Data In

Splunk offline command and bucket replication

koshyk
Super Champion

Reading through the "offline" documentation & "maintenance" mode documentation, I'm slighly confused, if we need to do both to do a long maintenance (about 8 hours) for an indexer. We don't want replication to happen as the disks are retained and the indexer contains around 20TB of data and it will take huge time if data is replicated.

Can you please confirm if my below understanding is true?

  1. splunk offline (without enforce): This will ensure all searches are serviced and gracefully make indexer peer offline. But if the "master" detects that the server is NOT up within restart_timeout period (60 secs), it will start bucket fixing tasks by replicating the copy to other indexers etc.
  2. splunk enable maintenance mode will ensure no replication happens. So enabling maintenance mode + shutdown of the indexer will ensure data will remain intact until system is all brought back correctly?

So I was planning to do Option2. Just wanted to check if my understanding is correct or anyone have better way/process to do indexer maintenance for long duration?

0 Karma
1 Solution

lguinn2
Legend

I agree, the best solution in your case is to (1) Set the cluster in maintenance mode. (2) Shutdown the indexer and perform maintenance. (4) Restart the indexer. (5) Remove the cluster from maintenance mode.

This will prevent the cluster from going into recovery mode during the maintenance window. Once the cluster exits maintenance mode, however, it will enter recovery mode to "catch up" on the replication that it forestalled during the maintenance window; this is normal and expected behavior. And you are right that this is much better than having the cluster trying to recover 20TB of data...

View solution in original post

lguinn2
Legend

I agree, the best solution in your case is to (1) Set the cluster in maintenance mode. (2) Shutdown the indexer and perform maintenance. (4) Restart the indexer. (5) Remove the cluster from maintenance mode.

This will prevent the cluster from going into recovery mode during the maintenance window. Once the cluster exits maintenance mode, however, it will enter recovery mode to "catch up" on the replication that it forestalled during the maintenance window; this is normal and expected behavior. And you are right that this is much better than having the cluster trying to recover 20TB of data...

koshyk
Super Champion

thank you

0 Karma
Get Updates on the Splunk Community!

Splunk at Cisco Live 2025: Learning, Innovation, and a Little Bit of Mr. Brightside

Pack your bags (and maybe your dancing shoes)—Cisco Live is heading to San Diego, June 8–12, 2025, and Splunk ...

Splunk App Dev Community Updates – What’s New and What’s Next

Welcome to your go-to roundup of everything happening in the Splunk App Dev Community! Whether you're building ...

The Latest Cisco Integrations With Splunk Platform!

Join us for an exciting tech talk where we’ll explore the latest integrations in Cisco + Splunk! We’ve ...