Deployment Architecture

Search Head Cluster: Members in SHC pool get out of synch and error in log files check for clock skew

Communicator

We have 3 Node SHC pool and the SHC is still frequently gets out-of-synch and keeps throwing the following UI banner message: "Error pulling configurations from the search head cluster captain; consider performing a destructive configuration resync on this search head cluster member."

These are the recommended setting changes implemented:
schedulingheuristic = roundrobin
captainisadhocsearchhead = true
replication
factor = 1

12-14-2015 17:22:54.072 +0000 WARN ConfReplication - installed_snapshot="/ngs/app/splunkt/SHC/splunk/var/run/splunk/snapshot/1450111567-b0d62539eea238d3c00ccbe9f81601fd6675f5d9.bundle" has earlier timestamp than existing snapshot="/ngs/app/splunkt/SHC/splunk/var/run/splunk/snapshot/1450113494-f4754dfa40753dbce4014552b9f64dbc6c00844d.bundle"; check for clock skew

What does this error message mean? Could this be the cause of the issue?

0 Karma

Splunk Employee
Splunk Employee

Another good question to check, are the clocks sync'd on all the members? If you have time drift, or the clocks are different, this always going to happen. Make sure your times are sycnd across all SHC members.

Splunk Employee
Splunk Employee

This WARN seems to be coming from when user is performing a destructive resync.
Most likely he timestamp for the installed_snapshot is coming from the captain's latest tarball. The timestamp for the "existing snapshot" is coming from the local member.
The message basically means that the latest snapshot from the captain has an earlier timestamp than the latest snapshot on the member, and hence destructive resync is "going back in time".

0 Karma