On a master node, the clustering dashboard has a column called 'status' for indexers and search heads. It shows either 'up' or one of several other statuses. I would like to write a search that replicates that output and that I can run every minute or two, so that I can raise an alert if one of the components in my clustered infrastructure fails and my operations team can remedy the situation. There's even a 'last heartbeat' for index nodes, which is at most 5 seconds old. This value would be great too! I would also like to do the same for the forwarder phone-home status shown in forwarder management.
Does anyone know which of the million entries in _internal or _audit might help with providing this status? Or is it somewhere else?
Any pointers appreciated, I've been looking at the _internal logs and am going blind. Thanks.
For the cluster peers you can query this endpoint: http://docs.splunk.com/Documentation/Splunk/6.1.2/RESTAPI/RESTcluster#cluster.2Fmaster.2Fpeers
| rest /services/cluster/master/peers
The results include a status field, and a last_heartbeat field as well.
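As a rough sketch of how that could be turned into an alert search, something like the following might work. It assumes last_heartbeat comes back as an epoch timestamp in seconds, that each peer row carries a label field with the peer's name, and that 60 seconds is an acceptable heartbeat age; the 60 is an arbitrary placeholder, and none of this has been checked against your version, so compare the field names with the raw output of the rest command first.

```
| rest /services/cluster/master/peers
| eval heartbeat_age = now() - last_heartbeat
| where lower(status) != "up" OR heartbeat_age > 60
| table label status last_heartbeat heartbeat_age
```

Saved as a scheduled search that runs every minute or two, this could trigger an alert whenever it returns one or more rows. It only covers the cluster peers; the forwarder phone-home status would need a different source.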