Getting Data In

[Splunk HTTP Event Collector] Having trouble connecting to HEC port.

sylim_splunk
Splunk Employee

We often see the following warning messages from the HEC servers, and users are complaining of connection failures to HEC:

04-16-2020 19:02:04.513 +0000 WARN HttpListener - Socket error from 10.1.32.176:3655 while accessing /services/collector/event/1.0: Connection reset by peer
04-16-2020 19:21:51.387 +0000 WARN HttpListener - HTTP active connections down to 1354, unthrottling
04-16-2020 19:21:51.354 +0000 WARN HttpListener - 1365 HTTP connections active, throttling
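
To confirm you are hitting this condition, you can check splunkd.log for the HttpListener warnings directly (a quick sketch; it assumes $SPLUNK_HOME points at your Splunk installation):

grep 'WARN' $SPLUNK_HOME/var/log/splunk/splunkd.log | grep 'HttpListener'
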
1 Solution

sylim_splunk
Splunk Employee

The number 1365 is derived from the open-files ulimit of 4096 (3 x 1365 = 4095). To avoid the connectivity issue itself, you can raise the open-files limit by editing /etc/security/limits.conf on Linux. If you find it is already set there, consider adding the ulimit settings to the Splunk start-up script as well; both the init.d and systemd variants are shown below.
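
As a reference, a minimal limits.conf entry might look like this (the user name splunk and the value 65535 are assumptions; use your actual Splunk service account and a value that fits your environment):

# /etc/security/limits.conf
splunk    soft    nofile    65535
splunk    hard    nofile    65535
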
i) init.d script: add the function below and change the start/restart cases to call it.

Define the set_ulimits function to configure the ulimits:

set_ulimits() {
    ulimit -Hn 65535
    ulimit -Sn 65535
}

Call the function from the start/restart cases:

start)
    set_ulimits
    splunk_start
    ;;

restart)
    set_ulimits
    splunk_restart
    ;;
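
After restarting Splunk, you can verify which limit the running splunkd process actually picked up (a quick sketch; it assumes the main process is named splunkd):

cat /proc/$(pgrep -x splunkd | head -1)/limits | grep 'open files'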

ii) Under systemd, you can increase the limit with LimitNOFILE in the unit file:
[Service]
Type=simple
Restart=always
ExecStart=/home/splunk/s734/bin/splunk _internal_launch_under_systemd
LimitNOFILE=655360
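
After editing the unit file, reload systemd and restart the service so the new limit takes effect (the unit name Splunkd.service is an assumption; use whatever your Splunk unit is actually called):

systemctl daemon-reload
systemctl restart Splunkd.service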

That said, 1300+ active HEC connections is very high, so you may want to monitor the performance over the network; for example, netstat -an will show whether the TCP Recv-Q is building up. If it is, you may need to load-balance the connections by spawning another HEC receiver.
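
For example, assuming HEC is listening on its default port 8088 (an assumption if you have changed it), something like this shows the Recv-Q column for the HEC sockets:

netstat -an | grep ':8088'
# or watch it over time to see whether Recv-Q keeps growing:
watch -n 5 "netstat -an | grep ':8088'"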
