We have had a standalone search head that was returning duplicate events for all searches. We have found the cause and the fix, and I wanted to add this here to help anyone else who runs into this kind of problem. Thanks to Tyler Germer and Duane (a.k.a. duckfez) for their help.
The symptoms are easy to describe. Every search run from the search head against the cluster returns exactly two copies of each event. Doing a dedup on _raw gives the right number of events with no duplicates. It didn't matter which index or source was searched.
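If you export the raw events from a search, a quick check like the one below can confirm the "exactly two of everything" pattern. This is only a sketch: the function name and the sample events are made up for illustration, and it assumes you already have the _raw values as a list of strings.

```python
from collections import Counter


def duplicate_summary(raw_events):
    """Summarize how often each raw event string occurs.

    Returns (total events, distinct events, histogram), where the
    histogram maps "number of copies" -> "how many events have that many".
    """
    per_event = Counter(raw_events)
    histogram = Counter(per_event.values())
    return len(raw_events), len(per_event), dict(histogram)


# Hypothetical sample: every event shows up twice, as in the symptom above.
events = ["a login ok", "b login failed", "a login ok", "b login failed"]
total, distinct, histogram = duplicate_summary(events)
print(total, distinct, histogram)
```

A histogram whose only key is 2 means every event was returned exactly twice, which matches the symptom described here.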
The search head was set up through the UI with the indexers configured manually under Settings -> Distributed Search -> Search Peers, instead of tying the search head to the cluster master, which provides the peers automatically.
So the answer to this problem is to remove all the search peers from the Settings -> Distributed Search -> Search Peers area of the UI, then connect the search head to the cluster master through the UI using Settings -> Indexer Clustering -> Enable Indexer Clustering.
Here is Tyler Germer's explanation of this problem:
This sounds like the standalone search head may not be configured correctly for Index Clustering. To elaborate on that, when a Search Head is configured to connect to an Index Cluster, it connects to the Cluster Master, the Cluster Master tells it which Indexers are in the Cluster, and also which Indexers have the data the Search Head may be searching for. The Cluster Master basically controls all the requests for data, going in and out of the Indexer Cluster.
If instead, you add the individual Indexers manually through Settings / Distributed Search, the Search Head will communicate directly with the Indexers. Because your data is replicated between Indexers (Search Factor / Replication Factor), you have multiple copies of your data on multiple Indexers. When you search for that data, technically more than one Indexer has what you need, so all will respond with that, thus getting duplicate events.
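To make the explanation above concrete, here is a toy model of the two search paths. It is a simplified sketch and not how Splunk is implemented: the indexer names, bucket names, and the idea of one designated "primary" copy per bucket are assumptions used only to show why replicated copies turn into duplicate results.

```python
# Toy model: each bucket has searchable copies on more than one indexer.
# All names here are hypothetical.
SEARCHABLE_COPIES = {
    "bucket_1": ["idx1", "idx2"],
    "bucket_2": ["idx2", "idx3"],
}

# Assumption for this sketch: the cluster designates one copy per bucket
# that is allowed to answer a cluster-aware search.
PRIMARY = {"bucket_1": "idx1", "bucket_2": "idx3"}

EVENTS = {
    "bucket_1": ["event A", "event B"],
    "bucket_2": ["event C"],
}


def search(cluster_aware):
    """Collect events from every indexer that responds for each bucket."""
    results = []
    for bucket, holders in SEARCHABLE_COPIES.items():
        for indexer in holders:
            # A cluster-aware search only hears from the designated copy;
            # a direct search hears from every indexer holding a copy.
            if cluster_aware and PRIMARY[bucket] != indexer:
                continue
            results.extend(EVENTS[bucket])
    return results


print(len(search(cluster_aware=True)))   # one copy of each event
print(len(search(cluster_aware=False)))  # every event returned twice
```

With two searchable copies of each bucket, the direct search returns every event twice, which is the same doubling described in the question.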
The fix is to remove all Indexers from the list in Distributed Search, then go to Settings / Indexer Clustering, Enable Indexer Clustering, configure your Search Head as a Search Head Node, then enter in the Cluster Master URI, along with the Secret Key. Then the Search Peers will automatically be populated by the Cluster Master, AND in the future if you add / remove Indexers, the Cluster Master will automatically update that list for you.
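For anyone who prefers config files to the UI, the same search head settings live in the [clustering] stanza of server.conf. The sketch below just renders that stanza as text. The setting names (mode, master_uri, pass4SymmKey) are the ones I understand to apply on versions that use the "cluster master" terminology, and newer releases may use a different name for the URI setting, so check the server.conf documentation for your version. The host name and the secret placeholder are made up.

```python
def clustering_stanza(master_uri, secret):
    """Render a server.conf [clustering] stanza for a search head.

    Assumption: setting names follow the "cluster master" era of
    server.conf; verify them against your Splunk version's docs.
    """
    lines = [
        "[clustering]",
        "mode = searchhead",
        f"master_uri = {master_uri}",
        f"pass4SymmKey = {secret}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
stanza = clustering_stanza("https://cm.example.com:8089", "<your secret key>")
print(stanza)
```

Either way, the manually added search peers still need to be removed first, otherwise the search head keeps talking to the indexers directly.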
So, the problem was in the configuration of the standalone search head. When you have a standalone search head (SA SH) in a clustered environment, still connect it to the cluster master so that you don't get duplicate events because of replication.
@cpetterborg - Thanks so much for providing a solution to this issue. Do you think you can post your solution as an answer to be accepted below? That way your question does not look unresolved and it can be easily found by other users with the same issue. Thanks!
Ah-ha! Nevermind--I refreshed and your answer was there 🙂