We have a Splunk deployment with 1,400 hosts running Universal Forwarders. These UFs forward to two intermediate Heavy Forwarders using SSL and load balancing. The hosts aren't sending much data; I'd guess 3-4 kbps on average each.
The problem we are seeing is that all of the hosts (UFs) are hitting SSL error 10054, which basically means the HF has dropped the connection:
09-07-2016 20:29:19.439 +0000 INFO TcpOutputProc - Connection to XX.XX.XX.XX:9997 closed. default Error in SSL_read = 10054, SSL Error = error:00000000:lib(0):func(0):reason(0)
Has anyone experienced something similar? I guess I should mention that these hosts are connected to the network through a satellite link. Which means that latency and general network connectivity could also play a part in this.
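For context, the forwarding side of this kind of setup is configured roughly as below on each UF (server names and certificate paths here are illustrative placeholders, not our actual values):

```
# outputs.conf on each Universal Forwarder (hostnames/paths are placeholders)
[tcpout]
defaultGroup = intermediate_hfs

[tcpout:intermediate_hfs]
# Two intermediate HFs; the forwarder load-balances across them automatically
server = hf1.example.com:9997, hf2.example.com:9997
sslCertPath = $SPLUNK_HOME/etc/auth/client.pem
sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem
```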
10054 means "Socket forcefully shut down by remote host".
I've seen a similar error message on some UFs, and in my case it was due to a network problem, i.e. a very short network cut that forced the TCP session to be reset (for example, a "network unreachable" returned to the host would immediately break the session).
But 10054 is a generic network error, so that's not the only possibility.
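On a lossy or high-latency link it may help to make the output channel more tolerant of short interruptions. The timeout settings in outputs.conf are one place to experiment; the values below are just a sketch to illustrate the knobs, not tested recommendations:

```
# outputs.conf on the UF -- example timeout tuning for a high-latency link
[tcpout]
# Allow more time before giving up on connect/read/write
connectionTimeout = 60
readTimeout = 600
writeTimeout = 600
# Switch between load-balanced HF targets less aggressively
autoLBFrequency = 120
```

Raising these won't fix an actual mid-session reset, but it can reduce how often the forwarder itself tears down a slow-but-alive connection.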
Were you ever able to resolve this issue? I've got three UFs on a fairly high-latency connection that report this error on a regular basis. I've implemented bandwidth throttling in limits.conf, but I still see the error logged.
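For reference, the throttling I mean is the standard limits.conf thruput cap on the UF (the value shown is just an example; tune it to your link):

```
# limits.conf on the UF -- cap outbound throughput in KB/s (0 = unlimited)
[thruput]
maxKBps = 64
```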