There is a calculation error in the number of bytes used by a given connection when logging; this log message is a false positive.
Use the following workaround to suppress the log. In $SPLUNK_HOME/etc/log-local.cfg, set:
category.AutoLoadBalancedConnectionStrategy=ERROR
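A minimal sketch of the file, assuming the [splunkd] stanza layout of the shipped $SPLUNK_HOME/etc/log.cfg (create log-local.cfg next to it if it does not exist, and restart splunkd to pick up the change):

[splunkd]
# Demote this category from WARN to ERROR to suppress the
# false-positive "is using N bytes" warnings.
category.AutoLoadBalancedConnectionStrategy=ERROR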
The issue is fixed in Splunk 9.1.3/9.2.1, where the log reports the correct value.
It's been added to the fixed issues (SPL-248188, SPL-248140):
https://docs.splunk.com/Documentation/Splunk/9.2.1/ReleaseNotes/Fixedissues
You may also want to check out https://community.splunk.com/t5/Knowledge-Management/Slow-indexer-receiver-detection-capability/m-p/...
Not every bug gets added there, as each release has hundreds of issues.
I will get this added to
https://docs.splunk.com/Documentation/Splunk/9.1.3/ReleaseNotes/Fixedissues
You can reach out to Support to get official confirmation of the fixed version.
We see this exact issue, and it started after upgrading to 9.2.0.1. Suppressing the warning works as expected, but I was curious whether you found this specific to 9.2. We are upgrading from 9.0.5, so it may have been introduced in 9.1 as well.
Correct. This is applicable for 9.1.0 and above.
Great. I wonder if it's been re-introduced; we're on 9.2.2 and still getting flooded...
(edited) My bad, I thought they meant the Heavy Forwarders; our clients (UFs) are still on versions prior to 9.2.2.
The log is still there, but instead of a huge number like
Current dest host connection is using 18446603427033668018 bytes
the fix logs the correct number.
See https://community.splunk.com/t5/Getting-Data-In/Current-dest-host-connection-is-using-18446603427033...
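For what it's worth (an observation from the number itself, not an official root-cause statement from Splunk): the bogus value sits just below 2^64 = 18446744073709551616, the wrap-around point of a 64-bit unsigned counter (2^64 - 18446603427033668018 = 140646675883598), which is what you'd expect if the byte counter had been decremented past zero and wrapped.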
Well, all of our servers are running 9.2.2 and all of our Universal Forwarders are running 9.2.1 or 9.2.2 and we are still seeing this log message.
EDIT 2024-07-23: Never mind. Closer inspection of the logs shows that they are working correctly in 9.2.1 and 9.2.2. The messages with the crazy high numbers are from older systems. The newer ones still report a number, but none larger than 1,000,000 (most around 512 kB).
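If you want to sanity-check this on your own deployment, a search along these lines should work (a sketch; the conn_bytes and max_bytes field names are made up for this example). It pulls the reported byte counts out of splunkd.log in the _internal index:

index=_internal sourcetype=splunkd "Current dest host connection" "is using"
| rex "is using (?<conn_bytes>\d+) bytes"
| stats max(conn_bytes) AS max_bytes BY host
| sort - max_bytes

Hosts still reporting values up near 2^64 are the ones on pre-fix versions.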
You may want to increase maxQueueSize in outputs.conf. The log indicates blocking due to a low tcpout queue size, or the target is causing back-pressure.
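For illustration, a minimal outputs.conf sketch (the 10MB figure is an assumption for this example, not a sizing recommendation; the default is auto):

[tcpout]
maxQueueSize = 10MB

A larger queue gives the forwarder more headroom while a receiver is slow, at the cost of more memory on the forwarder.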
Hi. We just upgraded from 9.0.6 to 9.1.4 and are seeing these same warnings.
Do we know that this was fixed in 9.1.4?
You are going to see a log that reports the correct number of bytes used, which will be less than maxQueueSize. This is useful for finding slow receivers and lets you fix the configs to avoid queue blocking.
What is not expected is a log reporting more bytes than maxQueueSize, for example 18446603427033668018 bytes.
The log below is an example of an expected log; here you'd rather fix the configs so that one receiver can't use the whole queue (see the breakdown after the log line).
WARN AutoLoadBalancedConnectionStrategy [xxxx TcpOutEloop] - Current dest host connection nn.nn.nn.nnn:9997, oneTimeClient=0, _events.size()=41, _refCount=2, _waitingAckQ.size()=5, _supportsACK=1, _lastHBRecvTime=Thu Jun 20 12:07:44 2023 is using 25214400 bytes. Total tcpout queue size is 26214400. Warningcount=841
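To read the numbers in that example: the one connection is holding 25214400 of the 26214400 bytes in the queue, i.e. 25214400 / 26214400 ≈ 96%, so a single slow receiver is close to starving every other destination.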
Can you share the log?
Thanks @hrawat
The logs are as expected then
05-03-2024 17:46:52.999 +0000 WARN AutoLoadBalancedConnectionStrategy [24761 TcpOutEloop] - Current dest host connection 1.2.3.4:5678, oneTimeClient=0, _events.size()=993, _refCount=1, _waitingAckQ.size()=0, _supportsACK=0, _lastHBRecvTime=Fri May 3 17:46:48 2024 is using 475826 bytes. Total tcpout queue size is 512000. Warningcount=2001
@burwell wrote: Thanks @hrawat
The logs are as expected then
05-03-2024 17:46:52.999 +0000 WARN AutoLoadBalancedConnectionStrategy [24761 TcpOutEloop] - Current dest host connection 1.2.3.4:5678, oneTimeClient=0, _events.size()=993, _refCount=1, _waitingAckQ.size()=0, _supportsACK=0, _lastHBRecvTime=Fri May 3 17:46:48 2024 is using 475826 bytes. Total tcpout queue size is 512000. Warningcount=2001
Yes, this is expected. It provides an early warning that one connection is using nearly the whole queue. If that indexer stops, or during an indexer rolling restart (IDX RR), the forwarder will not be able to move to the next indexer that is free. This log precisely identifies a slow connection (indexer/receiver) or a low maxQueueSize (Total tcpout queue size).
See https://community.splunk.com/t5/Knowledge-Management/Slow-indexer-receiver-detection-capability/m-p/...
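If you want to turn the warning into a report of your slowest receivers, a sketch along these lines should work (the dest, conn_bytes, queue_bytes, and pct_of_queue field names are made up for this example; the rex patterns follow the log format shown above):

index=_internal sourcetype=splunkd component=AutoLoadBalancedConnectionStrategy "Current dest host connection"
| rex "connection (?<dest>\S+:\d+),"
| rex "is using (?<conn_bytes>\d+) bytes\. Total tcpout queue size is (?<queue_bytes>\d+)"
| eval pct_of_queue = round(conn_bytes / queue_bytes * 100, 1)
| stats count max(pct_of_queue) AS max_pct BY host dest
| sort - max_pct

Destinations that consistently sit near 100% of the queue across many forwarders are your slow receivers.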
It would be great if this were logged as an actual bug, or at least a known issue.
Some of us have several thousand UFs spread across multiple environments, and updating log-local.cfg on each one just isn't feasible.
It's logged as a bug and fixed in 9.1.3/9.2.1.
Any documentation on this error? I did not see it in any of the Release Notes or Fixed Issues
It's not added to the release notes, but it is addressed in 9.1.3 (released) and 9.2.1 (not yet released).
Thanks, but I looked at both links below and see no mention of it... should I be looking somewhere else?