Hi,

We're currently facing a load imbalance issue in our Splunk deployment and would appreciate any advice or best practices.

Current Setup:
Universal Forwarders (UFs) → Heavy Forwarders (HFs) → Cribl

We originally had 8 HFs handling parsing and forwarding. Recently, we added 6 new HFs (for a total of 14) to distribute the load more evenly and offload the congested older HFs. All HFs are included in the UFs' outputs.conf under the same TCP output group.

Issue:
Some of the original 8 HFs are still showing blocked=true in metrics.log (splunktcpin queue full), while the newly added HFs receive little to no traffic. The load does not appear to be evenly distributed across the available HFs.

Here's our current outputs.conf deployed on the UFs:

[tcpout]
defaultGroup = HF_Group
forwardedindex.2.whitelist = (_audit|_introspection|_internal)

[tcpout:HF_Group]
server = HF1:9997,HF2:9997,...HF14:9997

We have not set autoLBFrequency yet.

Questions:
1. Do we need to set autoLBFrequency to achieve true active load balancing across all 14 HFs, even when none of them are failing?
2. If we set autoLBFrequency = 30, are there any potential downsides (e.g., performance impact, TCP session churn)?
3. Are there better or recommended approaches to ensure even distribution of UF traffic across multiple HFs before forwarding to Cribl?

Please note that we are sending a large volume of data, primarily WinEventLogs.

Your help is very much appreciated. Thank you.
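For context, this is roughly the change we are considering on the UFs. It is only a sketch: autoLBFrequency = 30 is the value from question 2, and autoLBVolume is included purely as a possible related setting with a placeholder value, not something we currently run or have tested.

[tcpout]
defaultGroup = HF_Group
forwardedindex.2.whitelist = (_audit|_introspection|_internal)

[tcpout:HF_Group]
server = HF1:9997,HF2:9997,...HF14:9997
# Rotate to a different HF in the list roughly every 30 seconds
autoLBFrequency = 30
# Possible addition (untested): also rotate after this many bytes, whichever limit is reached first
autoLBVolume = 1048576

If there is a better combination of settings for high-volume WinEventLog traffic, we are happy to adjust this.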