Ever since we added a few more Splunk forwarders to our environment, the Splunk server (search head, indexer, and deployment server on a single Windows box) has stopped accepting connections from the forwarders.
We have around 30 forwarders total, all sending to this one Splunk server.
The Splunk server is now on 4.3.2 and there is no change. Restarting the Splunk server helps for about 2 minutes: the agents reconnect, but end up in a failed state again after a couple of minutes.
Forwarder splunkd.log shows:
06-06-2012 11:27:11.884 -0700 INFO TcpOutputProc - Connected to idx=splunkserver:9997
06-06-2012 11:27:11.885 -0700 INFO TcpOutputProc - Connected to idx=splunkserver:9997
06-06-2012 11:28:03.981 -0700 INFO BatchReader - Removed from queue file='/opt/splunkforwarder/var/log/splunk/metrics.log.2'.
06-06-2012 11:29:41.070 -0700 INFO BatchReader - Removed from queue file='/opt/splunkforwarder/var/log/splunk/metrics.log.5'.
06-06-2012 11:29:55.226 -0700 WARN TcpOutputFd - Connect to splunkserver:9997 failed. Connection refused
06-06-2012 11:29:55.226 -0700 ERROR TcpOutputFd - Connection to host=splunkserver:9997 failed
06-06-2012 11:29:55.226 -0700 WARN TcpOutputFd - Connect to splunkserver:9997 failed. Connection refused
06-06-2012 11:29:55.226 -0700 ERROR TcpOutputFd - Connection to host=splunkserver:9997 failed
06-06-2012 11:29:55.226 -0700 INFO TcpOutputProc - Detected connection to splunkserver:9997 closed
06-06-2012 11:29:55.226 -0700 INFO TcpOutputProc - Detected connection to splunkserver:9997 closed
06-06-2012 11:29:56.553 -0700 WARN TcpOutputFd - Connect to splunkserver:9997 failed. Connection refused
06-06-2012 11:29:56.553 -0700 ERROR TcpOutputFd - Connection to host=splunkserver:9997 failed
06-06-2012 11:29:56.553 -0700 WARN TcpOutputFd - Connect to splunkserver:9997 failed. Connection refused
06-06-2012 11:29:56.553 -0700 ERROR TcpOutputFd - Connection to host=splunkserver:9997 failed
06-06-2012 11:29:56.553 -0700 WARN TcpOutputProc - Applying quarantine to idx=splunkserver:9997 numberOfFailures=2
06-06-2012 11:29:56.553 -0700 WARN TcpOutputProc - Applying quarantine to idx=splunkserver:9997 numberOfFailures=2
06-06-2012 11:30:25.221 -0700 INFO TcpOutputProc - Removing quarantine from idx=splunkserver:9997
The Splunk server's splunkd.log doesn't show much related to the inbound connections. Perhaps a debug flag needs to be set?
Any ideas?
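Regarding the debug flag idea above, my own untested guess (not something confirmed here) is that the receiving side could log more detail if the TcpInputProc category were raised in $SPLUNK_HOME/etc/log.cfg on the Splunk server, for example:
category.TcpInputProc=DEBUG
followed by a restart of splunkd so the logging change takes effect.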
Solution found!
$SPLUNK_HOME/etc/system/local/inputs.conf
[splunktcp://9997]
connection_host = none
Restart the Splunk server and it's fixed. DNS was holding it all up.
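To confirm the setting actually took effect on the receiver, one extra check that may help (my suggestion, not part of the original fix) is to look at the merged configuration with btool and then restart:
$SPLUNK_HOME/bin/splunk btool inputs list splunktcp --debug
$SPLUNK_HOME/bin/splunk restart
The --debug flag shows which file each value comes from, so you can see whether connection_host = none is really being picked up from etc/system/local/inputs.conf.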
Hello Team,
I did the same as you all suggested, but it doesn't work for me:
$SPLUNK_HOME/etc/system/local/inputs.conf
[splunktcp://9997]
connection_host = none
Any other workaround?
Regards,
Steven
Where should these settings go?
On my 2 heavy forwarders, on the cluster master, or on all 10 of my indexers?
@BP9906 @lrudolph @msclimenti
From the documentation, it can be set at various levels in inputs.conf.
I find it easier to set connection_host = ip, since it does not perform a reverse DNS lookup and you still get the IP if the hostname is not provided by the Splunk forwarder (i.e., if it's syslog or something similar).
To answer your question, you would want to review the connection_host setting on any receiving end, which would be your heavy forwarders and indexers.
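As a concrete sketch of that (assuming the receivers listen on the default port 9997; adjust the stanza to whatever port you actually use), the inputs.conf stanza on each heavy forwarder and indexer would look something like:
[splunktcp://9997]
connection_host = ip
This records the sender's IP as the host value without triggering a reverse DNS lookup for every incoming connection.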
On the indexers.
Did you ever find out why DNS resolution became a problem?
Not sure how you figured this out but thanks a ton!!!
I thought I'd also add that telnet splunkserver 9997 shows connection refused.
When I'm on the splunkserver box directly and do telnet localhost 9997, I get the same. netstat -ano reveals it is listening on 9997 and that splunkd.exe owns the PID bound to the port.
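For anyone reproducing this on Windows, the checks boil down to roughly the following (the PID 1234 is just a placeholder for whatever netstat reports on your box; the tasklist step is simply one way to confirm which process owns that PID):
REM confirm something is LISTENING on 9997 and note the owning PID
netstat -ano | findstr :9997
REM confirm that PID belongs to splunkd.exe
tasklist /FI "PID eq 1234"
REM should connect if the listener is actually accepting
telnet localhost 9997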
Yep, that's a "me too". This little gem was causing all kinds of slowness in event delivery and unpredictable connections from the UFs. Adding SSL to the UF-to-HF connection seems to make it even worse. The UFs complained:
Connect to x.x.x.x:9997 failed. No connection could be made because the target machine actively refused it
Connection to host=x.x.x.x:9997 failed
Cooked connection to ip=x.x.x.x:9997 timed out
Thanks ...Laurie:{)
Yeah! This was finally the solution to my problem, too. Our forwarders showed a lot of "WARN TcpOutputProc - Cooked connection to ip=x.x.x.x:9997 timed out" messages in the logs. Eventually we even lost data, despite having two indexers and useACK=true in place. We traced it back to the unconfigured connection_host setting on the indexers, which defaulted to "dns". Since we don't run a DNS server in our network, the number of forwarders we deployed eventually slowed everything down and led to data that couldn't be indexed. connection_host = none solved it all.
Thank you!
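In case it helps someone else hitting the same symptoms: the useACK part mentioned above lives on the forwarder side in outputs.conf. A minimal sketch of that configuration (the group name my_indexers and the server addresses are placeholders, not values from this thread):
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = indexer1.example.com:9997, indexer2.example.com:9997
useACK = true
With useACK = true the forwarder waits for an acknowledgment from the indexer before discarding data from its output queue.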
Where does the connection_host setting live, and in this case was it on the indexer(s) or on the forwarder that you changed it?
This is a setting on the indexers.
Yep, and that window is right after a restart of the Splunk server (i.e., the splunk.exe restart command). After that short window, data stops coming in from all the forwarders.
So is there some window in which telnet splunkserver 9997 does work?
Windows Firewall allows the traffic, especially since the agents do connect right after I restart the Splunk indexer (splunk.exe restart). Within 2-4 minutes of the indexer restart they disconnect and connections are refused; then, after about 5 minutes, the Splunk server starts accepting the TCP connections again, but no data is received by the indexer.
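For anyone else ruling out Windows Firewall, the inbound rule looks roughly like this (the rule name "Splunk receiving 9997" is just a placeholder for whatever the rule is called locally):
netsh advfirewall firewall show rule name="Splunk receiving 9997"
REM or, to create the inbound allow rule for the receiving port in the first place:
netsh advfirewall firewall add rule name="Splunk receiving 9997" dir=in action=allow protocol=TCP localport=9997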
I have the same issue. What was your resolution? I'm on 6.1.5 now.
Firewalls in play?