ERROR TcpInputProc - Error encountered for connection from timeout

Communicator

Hi there,

Since we rolled out a couple of hundred forwarders, we have been seeing connection errors.

If I do a telnet from a forwarder (Unix), sometimes I get an answer and sometimes I don't. When it works, we get events.

On the indexer I can see this error event:

ERROR TcpInputProc - Error encountered for connection from ... timeout

I have a lot of them...
The forwarders and the indexer are in the same subnet. We already installed a new indexer to verify whether we have an issue in our configuration; with the new indexer we have the same issue.

On the forwarder side we have the following warning message:

TcpOutputProc - Raw connection to ip ... :9997 timed out
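A quick way to quantify intermittent connects like the telnet test above is a small probe script. This is only a sketch — the host and port you pass in are placeholders for your own indexer address and receiving port:

```python
# Sketch: probe the indexer's receiving port repeatedly to measure how often
# the TCP connect fails. The host/port you call this with are placeholders
# for your own environment.
import socket


def probe(host, port, attempts=10, timeout=3.0):
    """Return (ok, fail) counts for TCP connect attempts to host:port."""
    ok = fail = 0
    for _ in range(attempts):
        try:
            # create_connection completes the TCP handshake or raises OSError
            # (refused, timed out, unreachable, DNS failure).
            with socket.create_connection((host, port), timeout=timeout):
                ok += 1
        except OSError:
            fail += 1
    return ok, fail
```

Run it from a forwarder against the indexer, e.g. `probe("10.1.2.3", 9997, attempts=50)` (address hypothetical). A nonzero fail count between hosts on the same subnet points at an intermediate device or host-level limit rather than Splunk itself.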

Has anyone had the same issue?

thanks in advance

Regards.

0 Karma
1 Solution

Communicator

This error is caused by the heartbeat function. Every 30 seconds a heartbeat is sent to the indexer; if the indexer doesn't receive it within that time, it writes a log entry with the timeout message. Network devices such as firewalls, or long remote connections, can cause this. I disabled the heartbeat. Another solution could be to change the frequency from 30 seconds...
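For reference, the heartbeat interval is configured on the forwarder in outputs.conf. A sketch only — verify the setting name against the outputs.conf spec for your Splunk version, and note the server address and group name here are placeholders:

```
# outputs.conf on the forwarder -- sketch; server address and group name
# are placeholders. Heartbeats apply only to cooked data.
[tcpout:primary_indexers]
server = 10.0.0.5:9997
# Default is 30 seconds; raising it (or disabling heartbeats, as the
# poster did) reduces the timeout noise on the indexer.
heartbeatFrequency = 120
```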

New Member

Are all of your SE servers using NTP, and do you have the correct DNS records loaded? Timing and authentication issues can cause problems in your Splunk infrastructure.

0 Karma

Ultra Champion

It could be that you are overloading your network and/or indexer.

Did the problem always exist, or did it start occurring once you reached a certain number of forwarders sending data?

Have you installed the Deployment Monitor app? It ships with Splunk by default; you just need to enable it. This can give you some insight into congestion problems.

Please tell us more about your HW/SW configuration (OS, Splunk version, etc.)


UPDATE:

Does the error occur for a particular type of forwarder?
Are your ulimit and other OS settings (forwarder and indexer) the same as for the other (functioning) landscape?
Are there intermediate network components that might be causing trouble (switches, routers, firewalls)?
Does the problem go away when you have lower loads (e.g. at night)?

/K

0 Karma

Explorer

Hi there, the error seems to have disappeared when I moved from a Universal Forwarder configuration to a Light Forwarder.

I was not able to get data into my indexers, but I'm not sure if this error had anything to do with it.

The error was appearing with as few as 4 hosts, so I don't think it's related to a network load issue.

0 Karma

Communicator

Just for your information: at the moment it seems to be normal behavior. We think these "error messages" don't influence Splunk's indexing behavior.

0 Karma

Communicator

@JasonCzerak: Did you find the solution or any hints for that?

0 Karma

Explorer

I have the same problem. The forwarders are on the same subnet as the intermediate forwarder. With as few as 10 connections to it, it would error out.

0 Karma

Champion

Also, to reply to this thread all you need to do is click "comment on this answer" below this message; it saves me converting your answers to comments 😉

0 Karma

Ultra Champion

There are several tools for this, depending on your OS, but common ones include Wireshark and tcpdump. /k
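A capture filtered to the Splunk receiving port is usually enough to see whether handshakes complete. A hypothetical invocation, using port 9997 as in this thread (interface and file name are placeholders):

```
# On the indexer, as root:
tcpdump -i any -nn -s 0 -w /tmp/splunk9997.pcap 'tcp port 9997'
# Then open the capture in Wireshark and look for incomplete handshakes
# (SYN with no SYN/ACK) or resets on port 9997.
```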

Communicator

Thanks. How does the packet capture work on an indexer?

0 Karma

Champion

This sounds a lot like a firewall or stateful-inspection issue. Do you have a firewall between the machines? I've seen this before where a firewall decided either not to allow a TCP connection, or timed one out too quickly, or decided it had been open too long. Perhaps it would be useful to do a packet capture on the indexer?

0 Karma

Communicator

Thanks for response.

Yes, with fewer connections we don't have trouble. It only came up with more forwarder connections.

We installed a new indexer with different hardware (other switch ports, other layer-3 components).

We don't have this issue on all forwarders, just a couple of hundred that are located in different subnets.

It is definitely not an issue with the three-way handshake (it completes successfully), meaning the TCP connection between forwarder <-> indexer works properly. All firewall logs were checked; no noticeable events.

We opened a Splunk support ticket today.

0 Karma

Ultra Champion

Updated with further questions above. /k

0 Karma

Communicator

Thanks for your response. The indexer doesn't have the status "overloaded".
Before we rolled out the new forwarders (~1000), we had a couple of hundred without these errors.
All queues are fine.
We already checked S.o.S and the Deployment Monitor without any helpful message. The only message I got is what I pasted before.

The indexer is a powerful quad-core machine with 16 GB of RAM. The indexes are located on a NetApp. The Splunk version is 4.3.1, on both the indexer and the forwarders.

We already tried the same scenario with 4.3; same behavior.

At the moment the network team is checking all points.

We do have exactly the same configurations in other landscapes (HW/SW) without any problems. And in other landscapes we have a lot more forwarders.

0 Karma