I am trying to figure out how I can measure the latency that my search head cluster nodes are experiencing between each other.
The search head cluster runs Splunk 6.6.3, and all servers are Windows Server 2012 R2. Two of the members are in one data center (along with the deployer), and the other member is in another data center.
The search head cluster has been up for a while and was running without any real issue. But after this month's Windows security patching and reboots, the captain now fails over to a different member fairly regularly. Before, it only failed over to another member when we were performing work on the cluster.
I suspect the issue has to do with latency between the cluster members, and I would like to query a metric that shows it.
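To make clearer what kind of number I am after, here is a rough sketch of how the network side could be sampled outside of Splunk. It only times TCP connects from one member to the others, so it approximates a network round trip and says nothing about Splunk's own heartbeat timing. The host names are hypothetical placeholders, and I am assuming the members listen on the default management port 8089.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=3.0):
    """Return a list of TCP connect times (in milliseconds) to host:port."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection completes the TCP handshake, so the elapsed time
        # approximates one network round trip plus connection setup.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        timings.append(elapsed * 1000.0)
    return timings


def summarize(timings):
    """Reduce a list of timings to min / median / max."""
    return {
        "min_ms": min(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


if __name__ == "__main__":
    # Hypothetical member host names; replace with the real cluster members.
    # 8089 is assumed here because it is Splunk's default management port.
    members = ["sh1.example.com", "sh2.example.com", "sh3.example.com"]
    for host in members:
        try:
            print(host, summarize(tcp_connect_latency_ms(host, 8089)))
        except OSError as exc:
            print(host, "unreachable:", exc)
```

Running this from each member in turn would give a rough matrix of member-to-member connect times, which could then be compared between the two data centers.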
Also, does anyone have other ideas about why this might suddenly have started? I have other standalone search heads that got the same security patches and are having no issues.