Hi everyone! Hoping I might be missing something simple.
We're running Splunk Enterprise 8.1.0 with the officially distributed Docker image. All is well with our search head cluster, except for one hard-to-track-down issue that has been causing frequent restarts of our search head tasks.
Everything starts up cleanly: we have a healthy search head cluster, the UIs return results normally, and so on. However, the Ansible health check that runs at the very end of the playbook appears to fail to validate that the splunkweb UI is up and running (it is).
included: /opt/ansible/roles/splunk_search_head/tasks/../../../roles/splunk_common/tasks/wait_for_splunk_instance.yml for localhost
Monday 23 August 2021 23:48:24 +0000 (0:00:00.045) 0:00:59.426 *********
FAILED - RETRYING: Check Splunk instance is running (60 retries left).
This will eventually fail after 60 retries and will force the container to restart, briefly disrupting the search head cluster.
We haven't overridden many options in web.conf aside from setting up ProxySSO (and this was already happening before we configured SSO).
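To check from inside the container whether the UI really answers, a manual probe along these lines could be used. This is only a rough sketch and not the playbook's actual check: the function names, the 3-second delay, and the accepted status codes are my own assumptions, and only the retry count of 60 comes from the log above.

```python
import time
import urllib.error
import urllib.request


def probe_http(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, and similar.
        return None


def wait_until_up(probe, retries=60, delay=3.0, ok_statuses=(200,), sleep=time.sleep):
    """Call probe() up to `retries` times; return (success, attempts_used)."""
    for attempt in range(1, retries + 1):
        if probe() in ok_statuses:
            return True, attempt
        if attempt < retries:
            sleep(delay)  # assumed delay; the real playbook may use another value
    return False, retries
```

For example, `wait_until_up(lambda: probe_http("http://localhost:8000"))` polls Splunk Web on its default port; the URL is an assumption and would need adjusting if splunkweb runs over HTTPS or behind a root endpoint. If this succeeds while the Ansible task keeps retrying, the difference is probably in which URL, scheme, or status code the task expects.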
According to wait_for_splunk_instance.yml, this is the configured check: