Solved: Why is cluster master stuck at "Bundle validation ...

nwales · ‎08-21-2014

As above, I kicked off an update and the cluster master is stuck at "Bundle validation is in progress" and has been for several hours now.

If I restart the splunk service and try again it does the same thing. I've made minor changes to the configuration to push it through but it makes no difference.

cam343 · ‎10-08-2014

In my situation I believe the cluster was 'out of sync' (my words not Splunks) due to a bad config being applied and restarting the cluster-master while attempting bundle validation etc.

I performed the following steps to resolve the issue:

On the cluster-manager run: /opt/splunk/bin/splunk show cluster-bundle-status - take note of the all the bundle-id's
On the cluster-manager edit something in your bundle so that it gets a new checksum (eg. add a comment to a file)
On the cluster-manager run: /opt/splunk/bin/splunk apply cluster-bundle
On cluster-manager run: /opt/splunk/bin/splunk show cluster-bundle-status - you should see the master's latest_bundle ID change. At this point you should see: "cluster_status=Bundle validation is in progress." in the output.

This is where it gets stuck, now at this point restart the splunk service on an indexer, watch the output of /opt/splunk/bin/splunk show cluster-bundle-status - it's latest_bundle ID should also change.

Once the above indexer restarts, continue to restart the splunk service on the remainder of your indexers in your cluster.
When you restart the Splunk service on the 'final' indexer the output from: /opt/splunk/bin/splunk show cluster-bundle-status - should show all indexers to have the same latest_bundle ID
At this point the cluster-master should then initate rolling restarts of the cluster to apply the config. At this point the active_bundle ID and latest_bundle ID should match up.

And hopefully your problem is now fixed.

PS I also removed /opt/splunk/etc/slave-apps.old, but that probably wasn't required.

View solution in original post

dolbyjoab · ‎10-23-2020

I had something similar.

From 100 indexers, 8 of them have not changed the active_bundle value to the newest one.

I restarted them (took them offline and restarted them) that has helped to resolve this issue.

PS: from the cluster Master, you can run: /opt/splunk/bin/splunk show cluster-bundle-status to get the bundle status of your peers

kkolla_splunk · ‎10-09-2017

Post Splunk 7.0 there is a simpler way to handle this. There is an endpoint that you can do a POST to, that cancels the bundle push operation, and gets the stuck cluster master out of the validate loop.

For more details, check these below links,

http://docs.splunk.com/Documentation/Splunk/7.0.0/RESTREF/RESTcluster#cluster.2Fmaster.2Fcontrol.2Fd...
http://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/Configurationbundleissues

samsplunks · ‎08-12-2021

Splunk 8.x.x here.

Profiling settings did block my apply bundle command.

/opt/splunk/bin/splunk apply cluster-bundle

Encountered some errors while applying the bundle.
Cannot apply (or) validate configuration settings. Bundle validation is in progress.


/opt/splunk/bin/splunk show cluster-bundle
...
<bundle_validation_errors on master>
...

This command did the trick:

curl -k -u admin https://CLUSTER_MASTER_IP:8089/services/cluster/master/control/default/cancel_bundle_push -X POST

And I could edit and apply the bundle afterwards.

wweiland · ‎05-22-2019

Cancelling the bundle push didn't actually work for me. I had to restart (./splunk restart) the indexer peers one at a time. Rolling restart from the CM won't work either.

siddesh333 · ‎10-14-2019

From - https://docs.splunk.com/Documentation/Splunk/6.4.4/Indexer/Configurationbundleissues
I added below contents and commented in INDEXER cluster master's server.conf

[sslConfig]
allowSslCompression = false

[clustering]
heartbeat_timeout = 600

and commented "pass4SymmKey"

RESTARTED splunk service, then apply cluster bundle
hope it helps.

cam343 · ‎10-08-2014

In my situation I believe the cluster was 'out of sync' (my words not Splunks) due to a bad config being applied and restarting the cluster-master while attempting bundle validation etc.

I performed the following steps to resolve the issue:

On the cluster-manager run: /opt/splunk/bin/splunk show cluster-bundle-status - take note of the all the bundle-id's
On the cluster-manager edit something in your bundle so that it gets a new checksum (eg. add a comment to a file)
On the cluster-manager run: /opt/splunk/bin/splunk apply cluster-bundle
On cluster-manager run: /opt/splunk/bin/splunk show cluster-bundle-status - you should see the master's latest_bundle ID change. At this point you should see: "cluster_status=Bundle validation is in progress." in the output.

This is where it gets stuck, now at this point restart the splunk service on an indexer, watch the output of /opt/splunk/bin/splunk show cluster-bundle-status - it's latest_bundle ID should also change.

Once the above indexer restarts, continue to restart the splunk service on the remainder of your indexers in your cluster.
When you restart the Splunk service on the 'final' indexer the output from: /opt/splunk/bin/splunk show cluster-bundle-status - should show all indexers to have the same latest_bundle ID
At this point the cluster-master should then initate rolling restarts of the cluster to apply the config. At this point the active_bundle ID and latest_bundle ID should match up.

And hopefully your problem is now fixed.

PS I also removed /opt/splunk/etc/slave-apps.old, but that probably wasn't required.

willsy · ‎06-11-2021

You are a scholar and a gent, 7 years on and what you put is still being helpful.

Thank you so much

wweiland · ‎05-22-2019

Still works 4.5 years later

BainM · ‎05-26-2020

still works 5.5 years later as well!

willsy · ‎06-11-2021

still works 7 years later.

champion

nwales · ‎11-05-2014

To confirm, the answer for me was restarting all the indexers while in maintenance mode. This as predicted by cam343 kicked off a rolling restart and moved all the indexers to the latest bundle version.

sansay · ‎06-05-2017

In my case, I had to also restart the master.
Until I did that, it kept showing me the Unsuccessful deployment and refuse to deploy the bundle.

mahamed_splunk · ‎09-09-2014

Did the log files on either master or peers have any error messages?

Why is cluster master stuck at "Bundle validation is in progress" indefinitely after configuration-bundle update?

indexer clustering

Continuing Innovation & New Integrations Unlock Full Stack Observability For Your ...

Monitoring Amazon Elastic Kubernetes Service (EKS)

Cloud Platform & Enterprise: Classic Dashboard Export Feature Deprecation