Getting Data In

Cooked connection timed out?

suhprano
Path Finder

I see some of these time outs in the /var/log/splunk/splunk.log
Is this something I should be concerned about? Does the forwarder try a resend? Is this potential data loss? Or, if there is a retry, does it handle the resend gracefully?

01-13-2012 01:52:56.098 +0000 INFO  TcpOutputProc - Connected to idx=x.x.x.x:9997
01-13-2012 01:54:15.760 +0000 WARN  TcpOutputProc - Cooked connection to ip=x.x.x.x:9997 timed out
01-13-2012 01:54:15.762 +0000 INFO  TcpOutputProc - Connected to idx=x.x.x.x:9997
01-13-2012 01:54:45.592 +0000 WARN  TcpOutputProc - Cooked connection to ip=x.x.x.x:9997 timed out
01-13-2012 01:55:15.423 +0000 WARN  TcpOutputProc - Cooked connection to ip=x.x.x.x:9997 timed out
01-13-2012 01:55:15.424 +0000 INFO  TcpOutputProc - Connected to idx=x.x.x.x:9997
01-13-2012 01:55:34.333 +0000 INFO  TcpOutputProc - Connected to idx=x.x.x.x:9997

Thanks!


mookiie2005
Communicator

I appreciate everyone who contributed to this issue. What I ended up finding is that the errors above were actually a symptom of a much larger problem. When I could not find any reason the log entries were not being sent over, I checked the internal logs and found that some did in fact make it to the Splunk instance. That instance had been reporting being at or near the maximum number of concurrent searches. Users were also reporting that logs were not getting indexed in time to appear on their dashboards, even though when I looked into it the data was there.

Eventually I took a look at the indexer, and that is when I found that all the queues (indexing, typing, parsing, and merging) were constantly at or near 100%. Once I saw that, everything started to make sense: the indexer is backed up and cannot keep up with the volume. At this point I am assuming an IOPS issue, since the hardware is well within the Splunk specifications except possibly for disk. Our group was not involved in or consulted on the install; all we have been told is that it is tier 1 storage, and at the moment we have no idea how many IOPS that offers. That is what we are investigating now.


risgupta_splunk
Splunk Employee

Has it ever been able to send logs to your indexers? If not, verify that your universal forwarder can reach the indexer:

 telnet splunkservername 9997

If that is successful, rule out slow or problematic DNS resolution by configuring your hosts file to map that server name to its IP. Report back after ruling these out. If you try both of these and there are still issues, chances are the problem is on the indexer and not on the client sending data. You can also try setting connection_host = none in the [splunktcp://9997] stanza of the indexer's inputs.conf, so the indexer does not try to resolve the forwarder's name.
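
For example, the receiving stanza on the indexer might look like this (a sketch only; the port and setting are taken from the suggestion above, the file path is the usual default):

```
# $SPLUNK_HOME/etc/system/local/inputs.conf on the indexer
[splunktcp://9997]
# skip reverse-DNS lookup of the connecting forwarder
connection_host = none
```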


bnorthway
Path Finder

I had configured the input in $SPLUNK_HOME/etc/system/local/inputs.conf. That is incorrect. I deleted that input stanza, and re-added the input through the GUI, and the new stanza was created in $SPLUNK_HOME/etc/apps/search/local/inputs.conf. All is good again!


dfredell
Explorer

I solved this issue by editing etc/system/local/inputs.conf on the receiving/indexing server.
As noted in the docs under "Set up receiving with the configuration file", all I had to add was:

[splunktcp://9997]

disabled = 0

These are the basic commands I ran on the forwarder linux server:

bin/splunk add monitor /logfile-gc.log -sourcetype gc

bin/splunk add forward-server 192.168.0.1:9997
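
For reference, the add forward-server command above should leave the forwarder with an outputs.conf roughly like this (a sketch; the group name is Splunk's default auto-load-balancing group, and the IP comes from the command above):

```
# $SPLUNK_HOME/etc/system/local/outputs.conf on the forwarder
[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
server = 192.168.0.1:9997
```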

nurtdi
Path Finder

We now manage our Splunk distributed search infrastructure with Puppet.
To solve the problem I deleted all search peers from each search head and then re-added them.

Here is a snippet from the script to run on search head:

search_peer_list="splunkidx1 splunkidx2"
splunk_pswd="splunkadminpassword"

for indexer_name in ${search_peer_list}
do
    ${splunk_home}/bin/splunk add search-server -host ${indexer_name}:8089 -auth admin:${splunk_pswd} -remoteUsername admin -remotePassword ${splunk_pswd}
done


ayushmaan
Explorer

@nurtdi Can you explain what an indexer cert is? How do I check its validity and details?



nurtdi
Path Finder

You can check cert validity and details using openssl:
openssl x509 -enddate -noout -in your_cert.pem
or
echo | openssl s_client -showcerts -connect your_splunk_server:port
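
To see what those checks look like end to end, here is a self-contained sketch: it generates a throwaway self-signed cert (the paths and CN are placeholders, not your real Splunk certs) and then checks its expiry with the same x509 subcommand.

```shell
# Generate a throwaway self-signed cert to test against (placeholder paths/CN)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=indexer" \
  -keyout /tmp/test-key.pem -out /tmp/test-cert.pem -days 30 2>/dev/null

# Print the expiry date, as in the command above
openssl x509 -enddate -noout -in /tmp/test-cert.pem

# -checkend N exits 0 if the cert is still valid N seconds from now
if openssl x509 -checkend 86400 -noout -in /tmp/test-cert.pem; then
  echo "cert still valid tomorrow"
fi
```

The -checkend form is handy in cron jobs, since the exit code can drive an alert instead of a human reading dates.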


jgoldberg_splun
Splunk Employee

A customer I was working with ran into the same issue. Somehow they ended up with two outputs.conf files (paths below) on the Universal Forwarder, each pointing to a different indexer IP. The first file below had the correct IP of the indexer; the second file did not - its IP went nowhere. Since the second file takes priority, the UF only started working once they pointed that second file at the correct IP.

C:\Program Files\SplunkUniversalForwarder\etc\apps\SplunkUniversalForwarder\local
C:\Program Files\SplunkUniversalForwarder\etc\system\local

BTW, I am not sure how they ended up with the two different IPs. Also, the best practice of course is to keep the indexer IP in one place, to make management easier.
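
To spot duplicate settings like this, btool can list every outputs.conf setting along with the file it comes from (run from the forwarder's install directory; this assumes a local Splunk install):

```
cd "C:\Program Files\SplunkUniversalForwarder"
bin\splunk btool outputs list --debug
```

Each line of the output is prefixed with the file path that contributed the setting, so a second, conflicting server entry stands out immediately.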


Ayn
Legend

If the issue is persistent, I suspect your forwarder is configured to set up an unencrypted connection to the indexer but the indexer only accepts encrypted connections - or vice versa.

mookiie2005
Communicator

@rgcurry we are having the same problem between an indexer and the search head. Were you able to resolve your issue? Please share if you found a solution.


mookiie2005
Communicator

@nurtdi we are having this same issue between one of our indexers and search head. Could you give me more detail on how you resolved your issue? Thanks!


MuS
SplunkTrust

thanks Ayn, you saved my day once again 😉


rgcurry
Contributor

I just set up my new implementation and I am getting this error message intermittently for one of the three indexers; which indexer reports the error appears to be random. Forwarders and indexers are otherwise communicating properly. What else might I look at as a possible cause for this situation?


Ayn
Legend

Excellent! Please mark some answer here as accepted, it shows that the "case is closed" so to speak 🙂


anderius
Explorer

It would be better to post the correct answer as a new answer, and then mark it...


nurtdi
Path Finder

Found the issue: a missing indexer cert.
I wish it were easier to find in the forwarder log why the connection was timing out...

closing the case. thank you!


anderius
Explorer

I do not understand how this solved the issue. I see the problem only sometimes (as seems to be the case in your question), that is, not always. Wouldn't a simply missing certificate mean the problem always happens?

nurtdi
Path Finder

Thank you for your response, Ayn. I have configured both the forwarder and the receiver to use SSL... and I have done it many times... not sure what is different this time.

I have case opened, will see:
Description: WARN TcpOutputProc - Cooked connection to ip=xx.xx.xx.xx:9992 timed out

forwarder:
-bash-3.2# cat outputs.conf
[tcpout]
defaultGroup = splunkssl-LB

[tcpout:splunkssl-LB]
server = splunk06:9992
compressed = true

[tcpout-server://splunk06:9992]
sslCertPath = $SPLUNK_HOME/etc/certs/forwarder.pem
sslCommonNameToCheck = indexer
sslPassword = xxxxxxxxxxxxxxxx
sslRootCAPath = $SPLUNK_HOME/etc/certs/cacert.pem
sslVerifyServerCert = true

receiver:
# HOST
[default]
host = splunk06

[SSL]
password = xxxxxxxxxxxxxxxxxxx
requireClientCert = true
rootCA = $SPLUNK_HOME/etc/certs/cacert.pem
serverCert = $SPLUNK_HOME/etc/certs/indexer.pem

[splunktcp-ssl://9992]
compressed = true


nurtdi
Path Finder

I am having the same issue; data is not being forwarded. Did you find what the issue was?
