I've opened a support ticket, but I'm hoping someone here may have seen this. I have an indexer cluster with two indexers and a cluster master, and I'm upgrading all of them from 6.4.3 to 6.4.6.
The CM was upgraded and placed into maintenance mode. Indexer 1 was taken offline (with "splunk offline"), upgraded, and rebooted.
On Indexer 2, I issued a "splunk offline" command, and it's still running 5 hours later. The machine isn't locked up - the status "dots" keep filling the command window.
Has anyone encountered this, or is anyone aware of a way to check the actual offline status and possibly close the window? I was following along with the upgrade procedure, but can't find any mention of this situation anywhere.
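One way to check the peer's status from the outside is to ask the cluster master, either with `splunk show cluster-status` run on the CM, or via the CM's REST API. Below is a minimal sketch of the REST approach; the hostname, port, and credentials are placeholders, and `parse_peer_status` is a hypothetical helper I wrote just to pull the per-peer "status" field out of the Atom XML response:

```shell
# Sketch only -- cm.example.com, 8089, and admin:changeme are placeholders.
# The CM endpoint /services/cluster/master/peers returns Atom XML containing a
# "status" key for each peer; this helper extracts those values.
parse_peer_status() {
  grep -o '<s:key name="status">[^<]*' | sed 's/.*>//'
}

# Example (run against the cluster master; -k skips certificate validation):
# curl -sk -u admin:changeme \
#   "https://cm.example.com:8089/services/cluster/master/peers" | parse_peer_status
```

The same information is visible in the CM's web UI under the clustering dashboard, if the command window itself can't be interrupted.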
Splunk support did contact me this morning. We weren't able to determine the exact cause of this behavior, but we found that stopping the splunkd process cleared the hang: the command prompt window where I had run "splunk offline" then displayed a message indicating that primaries had been reassigned and the operation was complete. I set the Splunk service to start manually, rebooted the server, and then installed the Splunk 6.4.6 update. It appears to be working now - the indexer rejoined the cluster and I have seen no signs of a problem.
*** I'll accept this as an answer with a caveat: I recommend contacting Splunk support in this situation, as they may identify something in splunkd.log that points to a root cause, or may advise against terminating the process the way I did ***
That's great to hear, thank you for the update. I will convert your last comment to an answer. If you could accept it, so the question shows as resolved for others that may run into the same situation, that'd be great. Thanks!
What OS are you running on?
Any error messages in the cluster master log?
Windows Server 2012 for the indexers. Windows Server 2012 R2 for the cluster master.
Looking in splunkd.log on the CM, nothing seems out of place (to me, at least). I see error messages about regex statements hitting a match limit (I use regex to blacklist some events), some warnings about cooked connections, and messages about one of my search heads, which is currently offline.
Most of the log contains INFO events pertaining to CMBucket - event=isFixupComplete
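When sifting through that log, one way to hide the CMBucket INFO noise and surface only warnings and errors is a small grep wrapper like the sketch below. The helper name and default path are my own assumptions; on Windows the log lives under %SPLUNK_HOME%\var\log\splunk\splunkd.log instead:

```shell
# Sketch: print the last N WARN/ERROR/FATAL lines from a splunkd.log file,
# skipping INFO entries (e.g. the CMBucket event=isFixupComplete noise).
# recent_problems is a hypothetical helper; adjust the path for your install.
recent_problems() {
  grep -E ' (WARN|ERROR|FATAL) ' "$1" | tail -n "${2:-20}"
}

# Example (default *nix path shown):
# recent_problems /opt/splunk/var/log/splunk/splunkd.log 50
```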