As above, I kicked off an update and the cluster master is stuck at "Bundle validation is in progress" and has been for several hours now.
If I restart the Splunk service and try again, it does the same thing. I've made minor changes to the configuration to try to push it through, but it makes no difference.
In my situation I believe the cluster was 'out of sync' (my words, not Splunk's) due to a bad config being applied and the cluster-master being restarted while attempting bundle validation.
I performed the following steps to resolve the issue (a condensed command sketch follows the steps):
On the cluster-manager run: /opt/splunk/bin/splunk show cluster-bundle-status - take note of all the bundle IDs.
On the cluster-manager, edit something in your bundle so that it gets a new checksum (e.g. add a comment to a file).
On the cluster-manager run: /opt/splunk/bin/splunk apply cluster-bundle
On the cluster-manager run: /opt/splunk/bin/splunk show cluster-bundle-status - you should see the master's latest_bundle ID change. At this point you should see "cluster_status=Bundle validation is in progress." in the output.
This is where it gets stuck. Now restart the splunk service on one indexer and watch the output of /opt/splunk/bin/splunk show cluster-bundle-status - its latest_bundle ID should also change.
Once that indexer restarts, continue to restart the splunk service on the remaining indexers in your cluster.
When you restart the Splunk service on the final indexer, the output from /opt/splunk/bin/splunk show cluster-bundle-status should show that all indexers have the same latest_bundle ID.
At that point the cluster-master should initiate a rolling restart of the cluster to apply the config, and the active_bundle ID and latest_bundle ID should match up.
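For reference, here is the same sequence condensed into commands. This is only a rough sketch: paths assume a default /opt/splunk install, and the file touched under master-apps is just one example of something to change so the bundle checksum moves.
# On the cluster manager: note the current bundle IDs
/opt/splunk/bin/splunk show cluster-bundle-status
# Change something in the bundle so it gets a new checksum,
# e.g. append a comment to a config file under etc/master-apps
# (this particular path is only an example)
echo "# checksum bump $(date)" >> /opt/splunk/etc/master-apps/_cluster/local/indexes.conf
# Push the new bundle and watch the status
/opt/splunk/bin/splunk apply cluster-bundle
/opt/splunk/bin/splunk show cluster-bundle-status
# On each indexer, one at a time, then re-check the status from the manager
/opt/splunk/bin/splunk restart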
And hopefully your problem is now fixed.
PS I also removed /opt/splunk/etc/slave-apps.old, but that probably wasn't required.
I had something similar.
Out of 100 indexers, 8 had not changed their active_bundle value to the newest one.
I restarted them (took them offline and restarted them), and that resolved the issue.
PS: from the cluster master, you can run /opt/splunk/bin/splunk show cluster-bundle-status to get the bundle status of your peers.
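If you only want to cycle the lagging peers, a rough sketch of what I did (peer-side paths assume /opt/splunk):
# On the cluster master: spot peers whose active_bundle does not match latest_bundle
/opt/splunk/bin/splunk show cluster-bundle-status
# On each lagging peer, one at a time
/opt/splunk/bin/splunk offline
/opt/splunk/bin/splunk start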
Post Splunk 7.0 there is a simpler way to handle this: there is an endpoint you can POST to that cancels the bundle push operation and gets the stuck cluster master out of the validation loop.
For more details, check the links below:
http://docs.splunk.com/Documentation/Splunk/7.0.0/RESTREF/RESTcluster#cluster.2Fmaster.2Fcontrol.2Fd...
http://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/Configurationbundleissues
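For example, the POST looks roughly like this (the hostname is a placeholder, and curl will prompt for the admin password):
curl -k -u admin -X POST https://your-cluster-master:8089/services/cluster/master/control/default/cancel_bundle_push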
Splunk 8.x.x here.
Profiling settings blocked my apply cluster-bundle command.
/opt/splunk/bin/splunk apply cluster-bundle
Encountered some errors while applying the bundle.
Cannot apply (or) validate configuration settings. Bundle validation is in progress.
/opt/splunk/bin/splunk show cluster-bundle-status
...
<bundle_validation_errors on master>
...
This command did the trick:
curl -k -u admin https://CLUSTER_MASTER_IP:8089/services/cluster/master/control/default/cancel_bundle_push -X POST
And I could edit and apply the bundle afterwards.
Cancelling the bundle push didn't actually work for me. I had to restart the indexer peers one at a time (./splunk restart). A rolling restart from the CM didn't work either.
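Roughly what I mean, assuming SSH access to the peers and a /opt/splunk install (hostnames are placeholders):
for peer in idx01 idx02 idx03; do
  ssh "$peer" /opt/splunk/bin/splunk restart
  # wait for the peer to rejoin before moving on; check from the CM with: splunk show cluster-bundle-status
done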
From - https://docs.splunk.com/Documentation/Splunk/6.4.4/Indexer/Configurationbundleissues
I added the contents below to the INDEXER cluster master's server.conf:
[sslConfig]
allowSslCompression = false
[clustering]
heartbeat_timeout = 600
and commented out "pass4SymmKey".
Then I RESTARTED the Splunk service and applied the cluster bundle.
Hope it helps.
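In other words, roughly this on the cluster master after editing server.conf (assuming /opt/splunk):
/opt/splunk/bin/splunk restart
/opt/splunk/bin/splunk apply cluster-bundle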
You are a scholar and a gent; 7 years on and what you posted is still helpful.
Thank you so much
Still works 4.5 years later
still works 5.5 years later as well!
still works 7 years later.
champion
To confirm, the answer for me was restarting all the indexers while in maintenance mode. This, as predicted by cam343, kicked off a rolling restart and moved all the indexers to the latest bundle version.
In my case, I had to also restart the master.
Until I did that, it kept showing an unsuccessful deployment and refused to deploy the bundle.
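For anyone following along, the rough sequence was (assuming /opt/splunk; the maintenance-mode commands run on the cluster master, the restarts on each indexer and finally on the master itself):
/opt/splunk/bin/splunk enable maintenance-mode
/opt/splunk/bin/splunk restart
/opt/splunk/bin/splunk disable maintenance-mode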
Did the log files on either master or peers have any error messages?