Splunk Enterprise Security

Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

I am upgrading Enterprise Security from 3.3.1 to 4.0 through the GUI. I installed the app by uploading the file, then started the "Post-Install Configuration". It moves through the first two steps fairly quickly, then appears to hang on "Disabling apps prior to installation".

I have 26 messages in the alert window; most of them are similar to the following:

User admin triggered the 'disable' action on app 'SA-NetworkProtection', and the following objects required a restart: distsearch, indexes

and then the following 2 specific messages:

Unable to initialize modular input "esdeploymentmanager" defined inside the app "SplunkEnterpriseSecuritySuite": Introspecting scheme=esdeploymentmanager: script running failed (exited with code 1).

Splunk must be restarted for changes to take effect. Click here to restart from the Manager.

I'm not sure what specifically is causing it to hang. Any tips on where to look, or has anyone seen a similar issue?

Edit: Splunk has confirmed this is a bug caused by how the UI reloads the Appserver in the background. I suspect it will no longer be an issue in a future release. In the meantime, see the answer below for a workaround that forces the update.

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Contributor

I had the same issue. I ended up restarting Splunk and retrying several times until it finally got past that step.

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

Have tried that a few times and no luck 😞

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Splunk Employee

I'm running into a similar issue with a fresh installation: it hangs on the installing applications step. I've restarted the installation 4 times so far with the same result.

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

I have a support ticket open with Splunk on the issue. If and when I get a resolution I'll certainly post it; maybe it will help others!

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

(Sorry it wouldn't let me edit my original post... something about not enough reputation points...)
Update 1: Digging through the logs, I found two that are relevant to this: essinstaller2.log and esinstallercontroller.log.

In esinstallercontroller.log I get what is essentially a timeout error on each of the run attempts (only those errors are pasted here):

2015-11-04 11:54:29,549 INFO file=<string>:stage:61 | stage="disable_apps" status="failed" exc="SplunkdConnectionException(u"Error connecting to /services/apps/local/SA-ThreatIntelligence/disable: ('The read operation timed out',)",) is not JSON serializable"
2015-11-04 11:54:29,549 INFO stage="disable_apps" status="failed" exc="SplunkdConnectionException(u"Error connecting to /services/apps/local/SA-ThreatIntelligence/disable: ('The read operation timed out',)",) is not JSON serializable"
2015-11-04 13:14:41,319 INFO stage="disable_apps" status="failed" exc="SplunkdConnectionException(u"Error connecting to /services/apps/local/SA-NetworkProtection/disable: ('The read operation timed out',)",) is not JSON serializable"
2015-11-04 13:41:57,632 INFO stage="disable_apps" status="failed" exc="SplunkdConnectionException(u"Error connecting to /services/apps/local/DA-ESS-AccessProtection/disable: ('The read operation timed out',)",) is not JSON serializable"

In essinstaller2.log, each run that "hangs" ends with just this:

2015-11-04 13:41:57,632 INFO stage="disable_apps" app="TA-tippingpoint" status="True" msg="disable"
2015-11-04 13:41:57,632 INFO STAGE COMPLETE: "disable_apps"

Earlier in the same log there is an error like the one below, which seems to correspond to the error thrown later in the "controller" log:

2015-11-04 13:40:45,209 INFO stage="disable_apps" app="DA-ESS-AccessProtection" status="False" msg="Splunkd daemon is not responding: (u"Error connecting to /services/apps/local/DA-ESS-AccessProtection/disable: ('The read operation timed out',)",)"
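
For anyone sifting through a long essinstaller2.log, here is a minimal Python sketch that pulls out the apps that failed to disable. It assumes the key="value" layout seen in the lines pasted in this thread; the helper names (parse_line, failed_apps) are made up for illustration and are not part of Splunk. Note that the msg value can itself contain double quotes, so it is captured up to the last quote on the line. The filter is limited to the disable_apps stage because other stages (such as refresh) also log status="False" without it meaning an error.

```python
import re

# stage/app/status values never contain quotes in the sample lines.
FIELD = re.compile(r'\b(stage|app|status)="([^"]*)"')
# msg may contain nested quotes, so capture up to the last quote on the line.
MSG = re.compile(r'\bmsg="(.*)"\s*$')


def parse_line(line):
    """Return a dict of the stage/app/status/msg fields found on one log line."""
    fields = dict(FIELD.findall(line))
    match = MSG.search(line)
    if match:
        fields["msg"] = match.group(1)
    return fields


def failed_apps(lines):
    """Return (app, msg) pairs for disable_apps lines with status="False"."""
    failures = []
    for line in lines:
        fields = parse_line(line)
        if fields.get("stage") == "disable_apps" and fields.get("status") == "False":
            failures.append((fields.get("app"), fields.get("msg", "")))
    return failures


# Two lines copied from the log excerpts in this thread.
SAMPLE = r'''
2015-11-04 13:40:45,209 INFO stage="disable_apps" app="DA-ESS-AccessProtection" status="False" msg="Splunkd daemon is not responding: (u"Error connecting to /services/apps/local/DA-ESS-AccessProtection/disable: ('The read operation timed out',)",)"
2015-11-04 13:41:57,632 INFO stage="disable_apps" app="TA-tippingpoint" status="True" msg="disable"
'''.strip().splitlines()

if __name__ == "__main__":
    for app, msg in failed_apps(SAMPLE):
        print(app, "->", msg)
```

To run it against a real log, replace SAMPLE with the lines read from the file.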

What this all means I couldn't tell you. What I will say is that I had what appeared to be one "clean" run starting at 12:13, which showed none of the above errors, yet the installer still hung and did nothing:

2015-11-04 12:13:30,833 INFO STAGE STARTING: "refresh"
2015-11-04 12:13:30,834 INFO stage="refresh" msg="Installation record found"
2015-11-04 12:13:30,834 INFO stage="refresh" msg="State data loaded."
2015-11-04 12:13:30,834 INFO stage="refresh" app="SA-EndpointProtection" status="False" msg="currently installed version: 3.3.1"
........
2015-11-04 12:13:30,842 INFO STAGE COMPLETE: "refresh"
2015-11-04 12:13:30,866 INFO STAGE STARTING: "deprecate_apps"
2015-11-04 12:13:30,866 INFO stage="deprecate_apps" msg="Installation record found"
2015-11-04 12:13:30,866 INFO stage="deprecate_apps" msg="State data loaded."
2015-11-04 12:13:31,755 INFO stage="deprecate_apps" app="TA-juniper" status="True" msg="deprecated"
........
2015-11-04 12:13:36,200 INFO STAGE COMPLETE: "deprecate_apps"
2015-11-04 12:13:36,235 INFO STAGE STARTING: "disable_apps"
2015-11-04 12:13:36,235 INFO stage="disable_apps" msg="Installation record found"
2015-11-04 12:13:36,235 INFO stage="disable_apps" msg="State data loaded."
2015-11-04 12:13:37,647 INFO stage="disable_apps" app="Splunk_TA_windows" status="True" msg="disable"
........
2015-11-04 12:16:14,137 INFO STAGE COMPLETE: "disable_apps"

and the comparable lines from the "controller" log:

2015-11-04 12:09:24,802 INFO Entering method: order
2015-11-04 12:09:24,802 INFO Installation stage order retrieved: [{'name': 'refresh', 'description': 'Refreshing app information'}, {'name': 'deprecate_apps', 'description': 'Deprecating old apps'}, {'name': 'disable_apps', 'description': 'Disabling apps prior to installation'}, {'name': 'install_apps', 'description': 'Installing new apps'}, {'name': 'reenable_apps', 'description': 'Re-enabling apps'}, {'name': 'postinstall', 'description': 'Conducting postinstall actions'}, {'name': 'finalize', 'description': 'Finalizing installation'}]
2015-11-04 12:13:29,527 INFO Entering method: order
2015-11-04 12:13:29,527 INFO Installation stage order retrieved: [{'name': 'refresh', 'description': 'Refreshing app information'}, {'name': 'deprecate_apps', 'description': 'Deprecating old apps'}, {'name': 'disable_apps', 'description': 'Disabling apps prior to installation'}, {'name': 'install_apps', 'description': 'Installing new apps'}, {'name': 'reenable_apps', 'description': 'Re-enabling apps'}, {'name': 'postinstall', 'description': 'Conducting postinstall actions'}, {'name': 'finalize', 'description': 'Finalizing installation'}]
2015-11-04 12:13:30,833 INFO Entering method: stage
2015-11-04 12:13:30,833 INFO stage="refresh" status="starting"
2015-11-04 12:13:30,843 INFO stage="refresh" status="completed"
2015-11-04 12:13:30,866 INFO Entering method: stage
2015-11-04 12:13:30,866 INFO stage="deprecate_apps" status="starting"
2015-11-04 12:13:36,200 INFO stage="deprecate_apps" status="completed"
2015-11-04 12:13:36,234 INFO Entering method: stage
2015-11-04 12:13:36,234 INFO stage="disable_apps" status="starting"
2015-11-04 12:16:14,137 INFO stage="disable_apps" status="completed"
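
To see how long each stage actually took, the starting/completed pairs in the "controller" log can be subtracted. Below is a rough Python sketch that does this, assuming the timestamp and stage="..." status="..." layout shown in the lines above; the function name stage_durations is just an illustrative choice.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2015-11-04 12:13:36,234 INFO stage="disable_apps" status="starting"
STAGE = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO '
    r'stage="([^"]+)" status="(starting|completed)"'
)


def stage_durations(lines):
    """Return {stage: seconds} for every starting/completed pair found."""
    started = {}
    durations = {}
    for line in lines:
        match = STAGE.match(line)
        if not match:
            continue  # skip "Entering method" lines, failures, etc.
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        stage, status = match.group(2), match.group(3)
        if status == "starting":
            started[stage] = stamp
        elif stage in started:
            durations[stage] = (stamp - started.pop(stage)).total_seconds()
    return durations


# Stage lines copied from the "clean" run in this thread.
SAMPLE = '''
2015-11-04 12:13:30,833 INFO stage="refresh" status="starting"
2015-11-04 12:13:30,843 INFO stage="refresh" status="completed"
2015-11-04 12:13:30,866 INFO stage="deprecate_apps" status="starting"
2015-11-04 12:13:36,200 INFO stage="deprecate_apps" status="completed"
2015-11-04 12:13:36,234 INFO stage="disable_apps" status="starting"
2015-11-04 12:16:14,137 INFO stage="disable_apps" status="completed"
'''.strip().splitlines()

if __name__ == "__main__":
    for stage, seconds in stage_durations(SAMPLE).items():
        print(f"{stage}: {seconds:.3f} s")
```

On the lines above, disable_apps comes out at a bit under 158 seconds, against roughly 5 seconds for deprecate_apps and a few milliseconds for refresh.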

So I am still scratching my head: if there was a "clean" run with no errors, why did it log "disable_apps" as completed yet never actually advance the web interface? The answer may be in splunkd.log, which has the following errors right after the last error from the "controller" log.

11-04-2015 13:41:57.633 -0500 WARN  HttpListener - Socket error from 127.0.0.1 while accessing /en-US/custom/SplunkEnterpriseSecuritySuite/installer_controller/disable_apps: Broken pipe
(web_service.log)
2015-11-04 13:41:57,633 INFO   [563a50edc37f76689ceb10] root:129 - ENGINE: Bus STOPPING

That was after the last run (the one with the other errors), so let's look at the run that seemingly should have succeeded:

11-04-2015 12:16:14.138 -0500 ERROR HttpListener - Exception while processing request from 127.0.0.1 for /en-US/custom/SplunkEnterpriseSecuritySuite/installer_controller/disable_apps: Connection closed by peer
11-04-2015 12:16:14.139 -0500 ERROR HttpListener - Handler for /en-US/custom/SplunkEnterpriseSecuritySuite/installer_controller/disable_apps sent a 0 byte response after earlier claiming a Content-Length of 3055!

(web_service.log)
2015-11-04 12:16:14,139 INFO    [563a3b87897facc4ad6ad0] root:129 - ENGINE: Bus STOPPING

Anyway, I am still at a loss here. It looks like something is wrong with the way the installer is coded, but the error messages are not clear enough for me to know what to fix. Why can't we have a manual installation route?

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

I am still waiting to see if Splunk comes back with a root cause on this, but after I worked with support on the phone for a couple of hours, the Dev team suggested that I try the following CLI command:

./splunk search '| testessinstall'

This allowed me to essentially run the install from the command line using the admin account. For some reason, every time I used my own account credentials, the install failed. This is either tied to mixed-up permissions (even though the admin account has the same "admin" role assigned to it that my user account does), or to the fact that I am using Single Sign-On with my user accounts instead of username/password...

I am inclined to think this is a permissions issue. In the meantime, if someone else gets stuck on the "disabling apps" part, running the install from the built-in admin account may help. (Note that the CLI was necessary in my case because username/password login is disabled on the Web GUI for me. If you run into trouble, I recommend first trying from the built-in admin account, then trying the CLI if that still isn't working.)
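
If you would rather script the workaround than type it, here is a small Python sketch that wraps the same command. It is only an illustration: build_command and run_install are made-up helper names, the default "./splunk" path assumes you are in the Splunk bin directory, and the optional credentials rely on the Splunk CLI's -auth user:password form. A password passed this way is visible in the process list, so leaving it out and answering the prompt is the safer choice.

```python
import subprocess


def build_command(splunk_bin="./splunk", user=None, password=None):
    """Build the argv for the workaround: ./splunk search '| testessinstall'."""
    # Passing a list avoids shell quoting; the search string stays one argument,
    # which is what the single quotes achieve on the command line.
    cmd = [splunk_bin, "search", "| testessinstall"]
    if user is not None and password is not None:
        # Assumption: standard Splunk CLI credential flag, -auth user:password.
        cmd += ["-auth", f"{user}:{password}"]
    return cmd


def run_install(splunk_bin="./splunk", user=None, password=None):
    """Run the install and return the CLI's exit code."""
    # No timeout is set on purpose: the install can run for a long time.
    completed = subprocess.run(build_command(splunk_bin, user, password), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Only show the command here; call run_install() yourself when ready.
    print(build_command())
```

Without credentials, the CLI prompts for the username and password as usual.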

Good Luck. If I hear anything further I'll be sure to share!

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Splunk Employee

Are there any followup commands to ./splunk search ' | testtessinstall' ? That just prompts me for the admin user/pass and then fails out saying testtessinstall is the first command of a search.

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

You have it typed as "testtessinstall"; the command is "testessinstall", with just one "t" before "essinstall". Otherwise no, the username/password prompt is normal (although you could also append -auth admin:changeme to the command). Just make sure that you have the single quotes around the search string as I listed above and it should work fine.

The only output I got after running it (and it took a good 30 minutes to complete, mind you) was the following:

ERROR: command="testessinstall", Initialization complete, please restart Splunk

You should also see various messages showing up in your logs, as well as messages in the GUI next to your Display Name.

Since you commented earlier about getting stuck on the installing apps step, you may be experiencing a slightly different problem, and it might be worth opening a support ticket with Splunk if you haven't already.

Re: Upgrade Enterprise Security from 3.3.x to 4.0 hangs on "Disabling Apps"

Communicator

I also had the same issue while upgrading ES from 3.3 to 4.1.
./splunk search '| testessinstall' worked for me.
Awesome and thanks for the workaround.
