I have a question about search heads and search peers. I have search heads and search peers in each of two datacenters. The plan is to configure the search peers (indexers) to receive data from sources that are local to that datacenter. That is, events from servers in datacenter A would go to search peers located in datacenter A.
Search heads also exist in both datacenters. While users will predominantly be searching through data that is local to that datacenter, there's no guarantee of that. If I believe that users could be searching through events that exist anywhere, I need to point them at all available search peers, correct? So search heads in datacenter A would be configured to know about all the search peers in datacenter A and all the search peers in datacenter B?
The datacenters are accessible to each other via a WAN (of course). I would assume that a search in datacenter A would trigger some traffic to datacenter B, but if there were no matching events from datacenter B, then the traffic would be very little? Or is Splunk even smarter than that about knowing where to send search requests?
My concern is that I want to limit traffic across the WAN when it's not strictly needed. Obviously I can prevent it by not adding search peers to search heads in different datacenters, but it would seem that I would also be preventing the possibility of all cross-datacenter searches which is also not what I want.
Thanks
The search heads delegate the queries to each indexer separately and only the results are returned back, to be aggregated for display by the search head. If there are no valid results for the query there is very little traffic. But yes, if you want the facility to perform searches on all your indexed content from either location, they must be configured to delegate to all indexers at all locations.
You could consider limiting access to data by user/role, so that only authorised users may perform searches across all sites. And you can default it so that users only have their local indexes and have to expressly broaden the searches to cross boundaries. Quite how you could get that to work if your indexes are identically named in both locations, I am not sure.
OK, so effectively if there's data on search peers in different datacenters that I don't expect to be accessed all the time, but I know that users will sometimes access data across datacenters depending on what they need, I should go ahead and add all search peers to all search heads. While there will be WAN traffic in this case, it will be minimal if there are no matching events on a remote search peer. I think I've got it.
Thanks!
Not quite. Search head says to all indexers: I want everything you have for "search x"; indexers say "here are my results" but that return traffic will be very little if they return empty results.
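To make that flow concrete, here is a toy Python model of the scatter/gather pattern described above. It is only a sketch and not Splunk code: the `Peer` class, the `distributed_search` function, the peer names, and the byte counting are all hypothetical, made up for illustration. It assumes the query goes to every configured peer and that only matching events come back.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Toy stand-in for a search peer (indexer); not a Splunk API."""
    name: str
    datacenter: str
    events: list = field(default_factory=list)

    def search(self, term):
        # Each peer filters its own local events; only matches are returned.
        return [e for e in self.events if term in e]


def distributed_search(term, peers, local_dc):
    """Send the query to every peer, merge results, and tally WAN traffic."""
    results = []
    wan_bytes = 0
    for peer in peers:
        matches = peer.search(term)
        results.extend(matches)
        if peer.datacenter != local_dc:
            # Crude estimate: the request itself plus whatever comes back.
            wan_bytes += len(term) + sum(len(m) for m in matches)
    return results, wan_bytes


# Hypothetical peers: one in each datacenter, holding only local events.
peers = [
    Peer("idx-a1", "A", ["error disk full on web01", "login ok on web02"]),
    Peer("idx-b1", "B", ["login ok on db01"]),
]

results, wan_bytes = distributed_search("error", peers, local_dc="A")
print(results, wan_bytes)
```

In this model the remote peer is always asked, so the WAN cost never drops to zero, but when the remote peer has no matches the cost is just the size of the request.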
I didn't mean to imply that I'd have indexes with the same names across datacenters. Other than "main", I guess. So essentially you're saying that the search heads would be smart about what indexes are where or worst case, the amount of traffic generated to remote search peers would be small if the data wasn't there?
It seems like the answer here really is "yes, add all search peers in this scenario to all search heads and Splunk will take care of things intelligently". If I want to actually restrict access, I can do more on top of that. Thanks.
Thanks. Both datacenters would NOT have the same data. The indexers in each datacenter would contain events from only "local" servers.
I think Splunk will search the metadata to return a search result, which will not take much traffic. I would also like to know what the architecture looks like from the indexing point of view. If both DCs contain the same indexes, they should be interconnected, as that will make the searches even faster. Again, this is my perception. I will also wait for the heavyweights to answer this. Thanks