It seems that my current process of quarantining a search peer and then running 'splunk offline' causes searches to become zombified.
"This search has encountered a fatal error and has been marked as zombied."
Is it best practice to quarantine the peer before or after running the 'splunk offline' command? I know that running 'splunk offline' gracefully stops new searches from reaching that indexer, but for some reason I think there is interference when the host is quarantined first and 'splunk offline' is run afterward.
Thanks richgalloway. My goal is to upgrade the indexer cluster without the end user seeing warnings when a peer goes down for the upgrade. Since removing the quarantine task, things are a little better; however, I still occasionally get "connection refused for peer=x" when a peer goes down via 'splunk offline' and a search is run at the same time.
Is it nearly impossible to perform an indexer cluster upgrade without a few "connection refused" warnings when a search is run while a peer is down?