After running out of disk space on a search head (part of a cluster), now fixed and all SHs rebooted, I get this error:
ConfReplicationException: Error pulling configurations from the search head cluster captain (SH2:8089); Error in fetchFrom, at=: Non-200 status_code=500: refuse request without valid baseline; snapshot exists at op_id=xxxx6e8e for repo=SH2:8089. Search head cluster member (SH3:8089) is having trouble pulling configs from the captain (SH2:8089).
Consider performing a destructive configuration resync on this search head cluster member.
Ran "splunk resync shcluster-replicated-config" and got this:
ConfReplicationException: Error downloading snapshot: Non-200 status_code=400: Error opening snapshot_file '/opt/splunk/var/run/snapshot/174xxxxxxxx82aca.bundle': No such file or directory.
The snapshot folder is usually empty; occasionally there are a few files, but they don't match what's on the other search heads.
'splunk show bundle-replication-status' is all green and identical to the other two SHs.
Is there a force-resync switch? I really can't remove this SH and run 'clean all'.
Thank you!
Hi @dmcnulty
The captain is refusing the sync request because the member doesn't have a valid baseline, and the subsequent resync attempt failed because a required snapshot file is missing or inaccessible.
The recommended action is to perform a destructive configuration resync on the affected member (SH3). This forces the member to discard its current replicated configuration and pull a fresh copy from the captain.
Run the following command on the affected search head member (SH3):
splunk resync shcluster-replicated-config --answer-yes
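After the resync completes, it's worth confirming the member has a clean view of the cluster before moving on. A quick sanity check (the -auth credentials are placeholders, substitute your own):

```shell
# On the affected member (SH3): show cluster status, including
# the current captain and each member's replication state.
splunk show shcluster-status --verbose -auth admin:changeme

# Confirm SH3 can actually reach the captain's management port.
curl -k https://SH2:8089/services/server/info
```

If SH3 reports the same captain as the other members and its status is "Up", the baseline should rebuild on the next replication cycle.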
If the destructive resync fails with the same or a similar error about a missing snapshot file, it may indicate a deeper problem with the captain's snapshot or with the member's ability to process the bundle. In that case, check the captain's splunkd.log for specific errors around replication bundles. If the issue persists, removing the member from the cluster and re-adding it is the standard, albeit more disruptive, next step.
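A hedged sketch of that fallback path, using the hostnames from this thread (SH1/SH2/SH3) and placeholder credentials; verify each step against your own deployment before running it:

```shell
# 1. On the captain (SH2): look for replication errors in splunkd.log.
grep -i "ConfReplication" /opt/splunk/var/log/splunk/splunkd.log | tail -20

# 2. If removal is unavoidable: detach SH3, run from another member (e.g. SH1).
splunk remove shcluster-member -mgmt_uri https://SH3:8089 -auth admin:changeme

# 3. On SH3: stop Splunk and clear the raft (captain-election) state.
splunk stop
splunk clean raft
splunk start

# 4. On SH3: rejoin the cluster by pointing at an existing member.
splunk add shcluster-member -current_member_uri https://SH1:8089 -auth admin:changeme
```

Step 3 only clears election state, not indexed data, so it is far less drastic than 'clean all', but it still removes the member from the cluster temporarily.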
I did run 'splunk resync shcluster-replicated-config'. I left it overnight and somehow SH3 synced itself. It also became the captain, which I changed back. Ran a resync on SH1 and all is good now.
No clue how or why it resynced itself after so many failed tries and clean-ups.