DSP 1.2.1 installation failed to execute phase "/runtime"

sylim_splunk
Splunk Employee

I see the log below in operation_install, showing continuous failures to connect to https://gravity-site.kube-system.svc.cluster.local:3009/healthz.
================
Wed Nov 10 02:40:41 UTC [INFO] [DAPD02] Executing postInstall hook for site:6.1.48.
Created Pod "site-app-post-install-125088-zqsmd" in namespace "kube-system".

Container "post-install-hook" created, current state is "waiting, reason PodInitializing".

Pod "site-app-post-install-125088-zqsmd" in namespace "kube-system", has changed state from "Pending" to "Running".
Container "post-install-hook" changed status from "waiting, reason PodInitializing" to "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Get https://gravity-site.kube-system.svc.cluster.local:3009/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" restarted, current state is "running".

[ERROR]: failed connecting to https://gravity-site.kube-system.svc.cluster.local:3009/healthz
Get https://gravity-site.kube-system.svc.cluster.local:3009/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Container "post-install-hook" changed status from "running" to "terminated, exit code 255".

Container "post-install-hook" changed status from "terminated, exit code 255" to "waiting, reason CrashLoopBackOff".
================
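The failing health check can be reproduced by hand from a cluster node, which helps separate a DNS problem from a connectivity problem. A minimal sketch; the hook's exact client options are not shown in the log, so the `-k` flag and the timeout value are assumptions:

```shell
# Resolve the service name first -- if this fails, cluster DNS (port 53)
# is the problem rather than the gravity-site service itself.
getent hosts gravity-site.kube-system.svc.cluster.local

# Then probe the same healthz endpoint the post-install hook checks.
# -k skips certificate verification; --max-time mirrors the client timeout
# seen in the error above ("Client.Timeout exceeded while awaiting headers").
curl -k --max-time 10 https://gravity-site.kube-system.svc.cluster.local:3009/healthz
```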

The gravity cluster status after the installation failure:
================
[root@DAPD02 crashreport]# gravity status
Cluster name: charmingmeitner2182
Cluster status: degraded (application status check failed)
Application: dsp, version 1.2.1
Gravity version: 6.1.48 (client) / 6.1.48 (server)
Join token: b9b088ce63c0a703ee740ba5dfb380d
Periodic updates: Not Configured
Remote support: Not Configured
Last completed operation:
* 3-node install
ID: 46614e3c-fcd1-4974-8cd7-dc404d1880b
Started: Wed Nov 10 02:33 UTC (1 hour ago)
Completed: Wed Nov 10 02:35 UTC (1 hour ago)
Cluster endpoints:
* Authentication gateway:
- 10.69.80.1:32009
- 10.69.80.2:32009
- 10.69.89.3:32009
* Cluster management URL:
- https://10.69.80.1:32009
- https://10.69.80.2:32009
- https://10.69.89.3:32009
Cluster nodes:
Masters:
* DAPD02 / 10.69.80.1 / master
Status: healthy
[!] overlay packet loss for node 10.69.89.3 is higher than the allowed threshold of 20% (current packet loss at 100%)
[!] overlay packet loss for node 10.69.80.2 is higher than the allowed threshold of 20% (current packet loss at 100%)
Remote access: online
* DWPD03 / 10.69.80.2 / master
Status: healthy
[!] overlay packet loss for node 10.69.80.1 is higher than the allowed threshold of 20% (current packet loss at 100%)
[!] overlay packet loss for node 10.69.89.3 is higher than the allowed threshold of 20% (current packet loss at 100%)
Remote access: online
* DDPD04 / 10.69.89.3 / master
Status: healthy
[!] overlay packet loss for node 10.69.80.2 is higher than the allowed threshold of 20% (current packet loss at 100%)
[!] overlay packet loss for node 10.69.80.1 is higher than the allowed threshold of 20% (current packet loss at 100%)
Remote access: online
================
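The [!] warnings above show 100% overlay packet loss between every pair of nodes, which points at the VXLAN overlay network (UDP 8472) being blocked between the hosts rather than at gravity-site itself. One way to confirm from a node is to watch for overlay traffic from its peers; a sketch using the peer IPs from the status output above (run it on DAPD02, and adjust hosts per node):

```shell
# Listen for VXLAN overlay traffic (UDP 8472) arriving from the peer masters.
# On a healthy cluster, overlay traffic appears here continuously; total
# silence while the peers are up suggests a firewall dropping UDP 8472.
tcpdump -ni any 'udp port 8472 and (host 10.69.80.2 or host 10.69.89.3)'
```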


sylim_splunk
Splunk Employee

I reproduced the same symptom by configuring my test environment with restrictive firewall settings. The installation then completed successfully with the firewall settings below:


allow 53 for TCP
allow 53 for UDP
allow 8472 for UDP
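On hosts using firewalld, those rules can be applied roughly as follows. This is a sketch assuming firewalld on a RHEL/CentOS host; zone defaults and any additional Gravity ports (such as the 32009 endpoints shown in the status output) are left to your environment:

```shell
# Allow DNS (TCP+UDP 53) and the VXLAN overlay port (UDP 8472)
# between cluster nodes, then reload to apply the permanent rules.
firewall-cmd --permanent --add-port=53/tcp
firewall-cmd --permanent --add-port=53/udp
firewall-cmd --permanent --add-port=8472/udp
firewall-cmd --reload
```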

