Slow indexer/receiver detection capability

From 9.1.3/9.2.1 onwards, the slow indexer/receiver detection capability is fully functional (SPL-248188, SPL-248140).
https://docs.splunk.com/Documentation/Splunk/9.2.1/ReleaseNotes/Fixedissues
You can enable it on the forwarding side in outputs.conf:
maxSendQSize = <integer>
* The size of the tcpout client send buffer, in bytes. If the tcpout client (indexer/receiver connection) send buffer is full, a new indexer is randomly selected from the list of indexers provided in the server setting of the target group stanza.
* This setting allows the forwarder to switch to a new indexer/receiver if the current indexer/receiver is slow.
* A non-zero value means that the max send buffer size is set.
* 0 means no limit on the max send buffer size.
* Default: 0
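As a minimal sketch of enabling it (the group name, server list, and byte value below are hypothetical, not taken from the docs or this thread):
[tcpout:my_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# Switch to another indexer once roughly 2 MB of unsent data is queued
# for the current connection; 0 (the default) disables the check.
maxSendQSize = 2000000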
Additionally, 9.1.3/9.2.1 and above will correctly log the target IP address causing the tcpout blocking.
WARN AutoLoadBalancedConnectionStrategy [xxxx TcpOutEloop] - Current dest host connection nn.nn.nn.nnn:9997, oneTimeClient=0, _events.size()=20, _refCount=2, _waitingAckQ.size()=4, _supportsACK=1, _lastHBRecvTime=Thu Jan 20 11:07:43 2024 is using 20214400 bytes. Total tcpout queue size is 26214400. Warningcount=20
Note: This config works correctly starting with 9.1.3/9.2.1. Do not use it with 9.2.0/9.1.0/9.1.1/9.1.2 (there is an incorrect calculation: https://community.splunk.com/t5/Getting-Data-In/Current-dest-host-connection-is-using-18446603427033...).

This setting definitely looks useful for slow receivers, but how would I determine when to use it, and what an appropriate value would be?
For example, you have mentioned:
WARN AutoLoadBalancedConnectionStrategy [xxxx TcpOutEloop] - Current dest host connection nn.nn.nn.nnn:9997, oneTimeClient=0, _events.size()=20, _refCount=2, _waitingAckQ.size()=4, _supportsACK=1, _lastHBRecvTime=Thu Jan 20 11:07:43 2024 is using 20214400 bytes. Total tcpout queue size is 26214400. Warningcount=20
I note that you have Warningcount=20, while a quick check in my environment shows Warningcount=1. If I'm just seeing the occasional warning, I'm assuming tweaking this setting would be of minimal benefit?
Furthermore, how would I appropriately set the bytes value?
I'm assuming it's per pipeline, and the variables involved might relate to volume per second per pipeline; are there any other variables?
Any example of how this would be tuned and when?
Thanks

If the warning count is 1, then it's not a big issue.
What the warning indicates is that, out of the maxQueueSize bytes of tcpout queue, one connection has occupied a large share, so the TcpOutputProcessor will pause. maxQueueSize is per pipeline and is shared by all target connections in that pipeline.
You may want to increase maxQueueSize (for example, double it), as sketched below.
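A minimal sketch of that change, with a hypothetical stanza name and an assumed current size of 25MB (the values are illustrative only):
[tcpout:my_indexers]
# doubled from an assumed 25MB; maxQueueSize is per pipeline and shared
# by all target connections in that pipeline
maxQueueSize = 50MB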

Thanks, I'll review the maxQueueSize.
If the warning count were higher, such as 20 in your example, what would be the best way to determine a good value (in bytes) for maxSendQSize to avoid the slow indexer scenario?

If Warningcount is high, then I would like to see whether the target receiver/indexer is applying back-pressure. Check whether queues are blocked on the target. If the queues are not blocked, check on the target using netstat:
netstat -an | grep <splunktcp port>
and see whether the Recv-Q is high. If the receiver queues are not blocked, but netstat shows the Recv-Q is full, then the receiver needs additional pipelines; an illustrative example of what to look for is shown below.
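For illustration only (the port, addresses, and byte counts here are hypothetical, not from this thread), a slow receiver would show a persistently large Recv-Q on its splunktcp port:
netstat -an | grep 9997
tcp   3145728        0   10.0.0.5:9997   10.0.0.21:51234   ESTABLISHED
tcp         0        0   10.0.0.5:9997   10.0.0.22:44822   ESTABLISHED
The second column is Recv-Q: bytes received by the kernel but not yet read by splunkd, so a large value that does not drain suggests the receiving pipeline cannot keep up.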
If Warningcount is high because there was a rolling restart at the indexing tier, then set maxSendQSize to roughly 5% of maxQueueSize. For example:
maxSendQSize=2000000
maxQueueSize=50MB
If using autoLBVolume, then ensure:
maxQueueSize > 5 x autoLBVolume
autoLBVolume > maxSendQSize
For example:
maxQueueSize=50MB
autoLBVolume=5000000
maxSendQSize=2000000
maxSendQSize is the total outstanding raw size of events/chunks in the connection queue that still needs to be sent to the TCP Send-Q. This generally builds up when the TCP Send-Q is already full.
autoLBVolume is the minimum total raw size of events/chunks to be sent to a connection.
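Putting the guidance above together, a sketch of a forwarder-side stanza that satisfies both relationships (maxQueueSize > 5 x autoLBVolume, and autoLBVolume > maxSendQSize); the stanza name and server list are hypothetical:
[tcpout:my_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# 50MB (52428800 bytes) of tcpout queue per pipeline
maxQueueSize = 50MB
# send at least ~5MB to a connection before load balancing away (50MB > 5 x 5MB)
autoLBVolume = 5000000
# switch away once ~2MB of unsent data piles up on one connection (5MB > 2MB)
maxSendQSize = 2000000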

One minor request: if this logging is ever enhanced, can it please include the output group name?
05-16-2024 03:18:05.992 +0000 WARN AutoLoadBalancedConnectionStrategy [85268 TcpOutEloop] - Current dest host connection <ip address>:9997, oneTimeClient=0, _events.size()=56156, _refCount=1, _waitingAckQ.size()=0, _supportsACK=0, _lastHBRecvTime=Thu May 16 03:18:03 2024 is using 31477941 bytes. Total tcpout queue size is 31457280. Warningcount=1001
This is helpful; however, the destination IP happens to be istio (a K8s software load balancer) and I have 3 indexer clusters with different DNS names on the same IP/port (the incoming DNS name determines which backend gets used). So my only way to "guess" the outputs.conf stanza involved is to set a unique queue size for each one, so I can determine which indexer cluster / output stanza is having the high warning count.
If it had tcpout=<stanzaname> or similar in the warning, that would be very helpful for me.
Thanks

From 9.4 onwards, the logging now includes the output group:
03-06-2025 11:38:16.306 +1300 WARN AutoLoadBalancedConnectionStrategy [199656 TcpOutEloop] - Current dest host connection=10.231.218.59:9997, connid=1, oneTimeClient=0, _events.size()=53605, _refCount=1, _waitingAckQ.size()=0, _supportsACK=0, _lastHBRecvTime=Fri Mar 6 11:35:48 2025 for group=myindexers is using 31391960 bytes. Total tcpout queue size is 31457280. Warningcount=0

Is this an actual WARN log message you found?
If yes, what was the reason for the back-pressure?

Yes, that's the actual WARN message. The worst I've seen is a warning count of 9001 with a 150MB queue; the forwarder itself forwards a peak of over 100MB/s:
05-21-2024 18:48:47.099 +1000 WARN AutoLoadBalancedConnectionStrategy [264180 TcpOutEloop] - Current dest host connection 10.x.x.x:9997, oneTimeClient=0, _events.size()=131822, _refCount=1, _waitingAckQ.size()=0, _supportsACK=0, _lastHBRecvTime=Tue May 21 18:48:36 2024 is using 157278423 bytes. Total tcpout queue size is 157286400. Warningcount=9001
That went from Warningcount=1 at 18:48:38.538 to Warningcount=1001 at 18:48:38.771
Then 18:48:38.90 has 2001
18:48:39.033 has 3001
18:48:39.134 has 4001
18:48:39.200 has 5001
18:48:39.336 has 6001
18:48:39.553 has 7001
18:48:46.500 has 8001 and finally:
18:48:47.099 has 9001
I suspect the back-pressure is caused by an istio pod failure in K8s. I haven't tracked down the cause, but I've seen some cases where the istio ingress gateway pods in K8s are in a "not ready" state; however, I suspect they were alive enough to take on traffic.
During this time period I will sometimes see higher-than-normal Warningcount= entries *and*, often around the same time, my website availability checks start failing against DNS names that point to istio pods.
My current suspicion is that it's not just Splunk-level back-pressure, but I'll keep investigating (at the time, the indexing tier showed that the most utilised TCP input queues were at 67%, using a max() measurement on their metrics.log).
The vast majority of my Warningcount= entries on this forwarder show a value of 1.
The configuration for this instance is:
maxQueueSize = 150MB
autoLBVolume = 10485760
autoLBFrequency = 1
dnsResolutionInterval = 259200
# tweak: connectionsPerTarget = 2 x approx. number of indexers
connectionsPerTarget = 96
# As per NLB tuning
heartbeatFrequency = 10
connectionTTL = 75
connectionTimeout = 10
autoLBFrequency = 1
maxSendQSize = 400000
# default is 30 seconds; we can retry more quickly with istio, as we should move to a new instance if it goes down
backoffOnFailure = 5
The maxSendQSize was tuned for a much lower-volume forwarder and I forgot to update it for this instance, so I will increase that. This instance also appears to have grown from 30-50MB/s to closer to 100MB/s, so I'll increase the autoLBVolume setting as well; a rough sketch of the revised values is below.
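Applying the earlier rule of thumb to the 150MB queue, purely as an illustration (these are my own back-of-the-envelope numbers, not values confirmed in the thread):
# ~5% of maxQueueSize (150MB = 157286400 bytes) is roughly 7.5MB
maxSendQSize = 7500000
# keep autoLBVolume above maxSendQSize and below maxQueueSize / 5 (~30MB)
autoLBVolume = 20971520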

That's great feedback. We will add the output group.

Thank you very much for the detailed reply; that gives me enough to action now.
I appreciate the contributions to the community in this way.
