Deployment Architecture

Search Head Cluster: Members in SHC pool get out of sync; log files show a "check for clock skew" error

sat94541
Communicator

We have a 3-node SHC pool, and the cluster still frequently gets out of sync and keeps throwing the following UI banner message: "Error pulling configurations from the search head cluster captain; consider performing a destructive configuration resync on this search head cluster member."

These are the recommended setting changes we have implemented:
scheduling_heuristic = round_robin
captain_is_adhoc_searchhead = true
replication_factor = 1
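
For reference, SHC member settings like these typically live in server.conf under the [shclustering] stanza on each member. A sketch of that placement (illustrative only; the exact stanza for each setting can vary by Splunk version, so verify against server.conf.spec for your release):

```ini
# $SPLUNK_HOME/etc/system/local/server.conf (illustrative placement;
# verify each setting's stanza against server.conf.spec for your version)
[shclustering]
scheduling_heuristic = round_robin
captain_is_adhoc_searchhead = true
replication_factor = 1
```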

12-14-2015 17:22:54.072 +0000 WARN ConfReplication - installed_snapshot="/ngs/app/splunkt/SHC/splunk/var/run/splunk/snapshot/1450111567-b0d62539eea238d3c00ccbe9f81601fd6675f5d9.bundle" has earlier timestamp than existing snapshot="/ngs/app/splunkt/SHC/splunk/var/run/splunk/snapshot/1450113494-f4754dfa40753dbce4014552b9f64dbc6c00844d.bundle"; check for clock skew

What does this error message mean? Could this be the cause of the issue?


esix_splunk
Splunk Employee

Another good thing to check: are the clocks synchronized on all the members? If you have time drift, or the clocks differ, this is always going to happen. Make sure your times are synced across all SHC members.
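
As a quick sanity check, you can collect the current epoch time from each member (e.g. `date +%s` over ssh, captured at roughly the same moment) and flag any pair that differs by more than a small tolerance. A minimal sketch, assuming the readings have already been gathered into a dict (the member names, sample values, and 2-second tolerance are illustrative):

```python
# Flag clock skew between SHC members, given each member's reported
# Unix epoch time (all readings collected at roughly the same moment).
def find_clock_skew(member_times, tolerance_secs=2):
    """Return (member_a, member_b, skew_secs) tuples exceeding the tolerance."""
    members = sorted(member_times)
    skewed = []
    for i, a in enumerate(members):
        for b in members[i + 1:]:
            skew = abs(member_times[a] - member_times[b])
            if skew > tolerance_secs:
                skewed.append((a, b, skew))
    return skewed

# Hypothetical readings: sh3 is roughly 32 minutes ahead of sh1 and sh2.
readings = {"sh1": 1450111567, "sh2": 1450111568, "sh3": 1450113494}
for a, b, skew in find_clock_skew(readings):
    print(f"{a} vs {b}: {skew} seconds of skew")
```

Any pair flagged here points at a member whose NTP/chrony sync needs fixing before the SHC will behave.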

rbal_splunk
Splunk Employee

This WARN seems to come from when a user performs a destructive resync.
Most likely the timestamp for the installed_snapshot comes from the captain's latest tarball, while the timestamp for the "existing snapshot" comes from the local member.
The message basically means that the latest snapshot from the captain has an earlier timestamp than the latest snapshot on the member, and hence the destructive resync is "going back in time".
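
The leading number in each snapshot bundle filename is a Unix epoch timestamp, so you can compare the two bundles from the WARN directly. A small sketch (the parsing helper is mine, not part of Splunk):

```python
import os
import re

def snapshot_epoch(path):
    """Extract the leading Unix epoch timestamp from a snapshot bundle filename."""
    name = os.path.basename(path)
    match = re.match(r"(\d+)-[0-9a-f]+\.bundle$", name)
    if not match:
        raise ValueError(f"unexpected snapshot name: {name}")
    return int(match.group(1))

# Paths taken from the WARN message above.
installed = ("/ngs/app/splunkt/SHC/splunk/var/run/splunk/snapshot/"
             "1450111567-b0d62539eea238d3c00ccbe9f81601fd6675f5d9.bundle")
existing = ("/ngs/app/splunkt/SHC/splunk/var/run/splunk/snapshot/"
            "1450113494-f4754dfa40753dbce4014552b9f64dbc6c00844d.bundle")

skew = snapshot_epoch(existing) - snapshot_epoch(installed)
print(f"captain's snapshot is {skew} seconds (~{skew // 60} min) older than the member's")
# → captain's snapshot is 1927 seconds (~32 min) older than the member's
```

Here the captain's bundle predates the member's local one by about half an hour, which is exactly the "going back in time" condition the WARN describes, and a gap that large on a healthy cluster is a strong hint of clock skew.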
