I have two search conditions that I need to trigger alerts from. I have a hundred hosts in an HA cluster. Sometimes a host leaves the HA cluster and comes back online, either due to network issues or during production changes by engineers. When a host leaves the HA cluster, I get a single message in Splunk that reads "serverX has gone out-of-sync". When the host rejoins the HA cluster, I get a single message in Splunk that reads "serverX has gone in-sync". So I have two kinds of search results to work with.

My goal: when a host leaves the HA cluster and comes back within an hour, do not send any alert. But if a host leaves the HA cluster and does not come back online within an hour, trigger an alert.

Here is what I have done so far (search period = 1hr):

index=test sync_status="out-of-sync" [search index=test sync_status="in-sync" | dedup server | table server]

I get undesired results. I expect to see only the hosts that went offline but did not rejoin the cluster (and I do see the right events when I run the two simple searches separately). Am I heading in the right direction, from a search and logic perspective? Are there better search methods for doing this?
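To restate the logic I'm aiming for, here is a sketch of an alternative I've been considering, using stats instead of a subsearch (untested, assuming the same server and sync_status fields as above):

```
index=test sync_status IN ("out-of-sync", "in-sync") earliest=-1h
| stats latest(sync_status) AS last_status BY server
| where last_status="out-of-sync"
```

The idea is that, over the one-hour window, a host whose most recent event is still "out-of-sync" has not rejoined the cluster, while a host that came back ends with "in-sync" and gets filtered out. I'm not sure whether this is the idiomatic way to correlate the two events.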