Getting Data In

[Splunk HTTP Event Collector] Having trouble connecting to HEC port.

sylim_splunk
Splunk Employee
Splunk Employee

We are often seeing the following error messages from HEC servers and users are complaining of failures connecting to HEC:

04-16-2020 19:02:04.513 +0000 WARN HttpListener - Socket error from 10.1.32.176:3655 while accessing /services/collector/event/1.0: Connection reset by peer
04-16-2020 19:21:51.387 +0000 WARN HttpListener - HTTP active connections down to 1354, unthrottling
04-16-2020 19:21:51.354 +0000 WARN HttpListener - 1365 HTTP connections active, throttling
0 Karma
1 Solution

sylim_splunk
Splunk Employee
Splunk Employee

The number 1365 is limited by the open files, 4096 in ulimit ( 3 X 1365 = 4095) . To avoid the connectivity issue itself you can increase the open files by modifying the etc/security/limits.conf in Linux. If you find it is there already then consider to add it to splunk start up script
i) initd script, add the below and change the start/restart to call the function

set_ulimit function to configure ulimits

set_ulimits() {
ulimit -Hn 65535
ulimit -Sn 65535
}

Add the fuction for start/restart

start)
set_ulimits
splunk_start
;;

restart)
set_ulimits
splunk_restart

ii) Under systemd you can increase it using LimitNOFILE
[Service]
Type=simple
Restart=always
ExecStart=/home/splunk/s734/bin/splunk _internal_launch_under_systemd
LimitNOFILE=655360

But 1300+ HEC connections are very high you may want to monitor the performance over the network, like netstat -an will show if the TCP Recv queue is building up or not. Then you may need to load balance the # of connections by spawning another HEC receiver.

View solution in original post

sylim_splunk
Splunk Employee
Splunk Employee

The number 1365 is limited by the open files, 4096 in ulimit ( 3 X 1365 = 4095) . To avoid the connectivity issue itself you can increase the open files by modifying the etc/security/limits.conf in Linux. If you find it is there already then consider to add it to splunk start up script
i) initd script, add the below and change the start/restart to call the function

set_ulimit function to configure ulimits

set_ulimits() {
ulimit -Hn 65535
ulimit -Sn 65535
}

Add the fuction for start/restart

start)
set_ulimits
splunk_start
;;

restart)
set_ulimits
splunk_restart

ii) Under systemd you can increase it using LimitNOFILE
[Service]
Type=simple
Restart=always
ExecStart=/home/splunk/s734/bin/splunk _internal_launch_under_systemd
LimitNOFILE=655360

But 1300+ HEC connections are very high you may want to monitor the performance over the network, like netstat -an will show if the TCP Recv queue is building up or not. Then you may need to load balance the # of connections by spawning another HEC receiver.

Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...