I keep seeing this message in splunkd.log on my instance, what does it mean?
My instance is used primarily as a search and indexing instance, and it also distributes searches to other indexing instances.
It means that your search-head dispatched a search to the peer machines but didn't get a response from one or more peers within the specified timeout period.
Should have gotten at least 3 tokens in status line = the response's HTTP status line should have contained at least three tokens (protocol version, status code, reason phrase)
Only got 0 = no response came back at all
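The "tokens" are the whitespace-separated pieces of the HTTP status line the client expects back from the peer. A minimal shell sketch of the counting involved (the status-line value is illustrative):

```shell
# A well-formed HTTP response begins with a status line of three tokens:
# protocol version, status code, and reason phrase.
status_line='HTTP/1.1 200 OK'
set -- $status_line
echo "tokens: $#"    # a healthy peer response: 3 tokens

# A peer that times out or drops the connection sends nothing back,
# so the client parses zero tokens -- hence "Only got 0".
empty_reply=''
set -- $empty_reply
echo "tokens: $#"
```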
The error message is lacking in that it describes a potential problem but doesn't reveal the source machine. In general, though, it's indicative of an overworked search peer that doesn't have the cycles to spare to handle another search request, or there could be a network issue between the two machines.
It's also possible that the search-head is itself oversubscribed in one way or another. For example, it is not recommended to run the Deployment Server on a search-head instance, as both this component and distributed search share the splunkd management port. Run the Deployment Server in its own dedicated instance instead.
The first step toward resolution is to figure out which peer is causing the error message. Try enabling your peers one at a time to narrow it down.
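As an alternative to bouncing peers one at a time, you can probe each peer's splunkd management port directly and see which ones fail to answer at all. A rough sketch, assuming curl is available and the peers use the default management port 8089 (the host names below are placeholders):

```shell
#!/bin/sh
# Probe each search peer's splunkd management port and report which
# ones fail to answer within the timeout. Even a 401 Unauthorized
# still counts as a response here -- the error in question means the
# client got no status line at all.
for peer in peer1.example.com:8089 peer2.example.com:8089; do
  if curl -sk --max-time 5 -o /dev/null "https://$peer/services/server/info"; then
    echo "$peer: responded"
  else
    echo "$peer: no response"
  fi
done
```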
I am seeing the same thing...
Windows Server 2003 used as the indexer, with the distributed search head running on Linux.
ERROR HTTPClient - Should have gotten at least 3 tokens in status line, while getting response code. Only got 0.
ERROR TcpInputFd - SSL Error for fd from HOST:yy.xxx.55.10, IP:yy.xxx.55.10, PORT:45123
ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
ERROR TcpInputFd - SSL Error = error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
It works fine with no errors when a Linux search head talks to a Linux indexer.
From my own experience, I would rather think it means something else, or that there is a bug, or that something is slightly misconfigured.
I keep getting these messages from ALL BUT ONE of the hosts in my setup.
1 Indexer v4.2.1
1 Search Head v4.2.1
1 Heavy Forwarder v4.2.2 <-- only the 4.2.2 host is NOT generating these messages.
1 Universal Forwarder v4.2.0
50-odd Universal Forwarders v4.2.1
I do not think it is network related, since the Search Head and Indexer are on the same VLAN and neither of them are under any considerable load (~5GB of daily log).
I do not believe that it has anything to do with HOW logs are transported to the indexer. Below is the outputs.conf that is delivered to all forwarders.
[tcpout]
defaultGroup = splunkssl
disabled = false
compressed=true
[tcpout:splunkssl]
server = splunkindex.company.com:9997
[tcpout-server://splunkindex.company.com:9997]
sslCertPath = $SPLUNK_HOME/etc/apps/company-forwarding/local/company-splunk-forwarder.pem
sslCommonNameToCheck = company-splunk-indexer.company.com
sslPassword = XXXXXXXXX
sslRootCAPath = $SPLUNK_HOME/etc/apps/company-forwarding/local/company-ca.pem
sslVerifyServerCert = true
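With a config like this, one thing worth ruling out is a mismatch between sslCommonNameToCheck and the CN actually embedded in the indexer's certificate, since sslVerifyServerCert = true makes such a mismatch fatal. A sketch using a throwaway self-signed certificate (paths and the CN are taken from the config above purely for illustration):

```shell
# Create a throwaway cert whose CN matches sslCommonNameToCheck, then
# print its subject -- this is what the forwarder compares against.
# Against a real deployment you would run the x509 step on the
# indexer's actual server certificate instead.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/test-key.pem -out /tmp/test-cert.pem \
  -subj "/CN=company-splunk-indexer.company.com" 2>/dev/null
openssl x509 -in /tmp/test-cert.pem -noout -subject
```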
I would rather guess that this is along the lines of the bogus(?) error message:
Error encountered for connection from src=10.XXX.XXX.XXX:65278. Success
That error also came in from all forwarders until we upgraded from 4.2.0 to 4.2.1. So currently there is one UF (4.2.0) still sending these latter error messages, but the logs keep coming in anyway.
If anyone knows a definite answer it'd be good to know, but I'm not too worried as everything seems to be working fine.
Kristian
I see these errors from a Windows Server 2003 machine, and my log machine is way underutilized.
The known issue n8 mentions is this one:
Deployment server: 'splunk reload deploy-server' command causes Linux host to freeze. (SPL-62493)
...restart splunkd on the deployment server
Restart WHICH splunk process?
The Deployment server or each one of the clients trying to talk to the Deployment server?
There is a known issue with the deployment server in 5.0.2. You will see errors like the ones below on your clients if this issue is affecting you.
WARN NetUtils - Bad select_for_loop rv = -2
ERROR HTTPClient - Should have gotten at least 3 tokens in status line, while getting response code. Only got 0.
DeploymentClient - Unable to send phonehome message to deployment server. Error status is: not_connected
This can be caused by disabling or restarting the deployment server (i.e., splunk disable deploy-server or .../en-US/debug/refresh?entity=admin/deploymentserver).
Restart splunk until a patch is released.
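A quick way to tell whether a given client is hitting this particular issue, rather than a generic connectivity problem, is to check whether all three messages appear together in its splunkd.log. A sketch, using an inline sample log in place of $SPLUNK_HOME/var/log/splunk/splunkd.log:

```shell
# Sample log standing in for $SPLUNK_HOME/var/log/splunk/splunkd.log.
cat > /tmp/sample_splunkd.log <<'EOF'
WARN NetUtils - Bad select_for_loop rv = -2
ERROR HTTPClient - Should have gotten at least 3 tokens in status line, while getting response code. Only got 0.
DeploymentClient - Unable to send phonehome message to deployment server. Error status is: not_connected
EOF

# The issue's fingerprint is all three messages appearing together.
if grep -q 'Bad select_for_loop' /tmp/sample_splunkd.log \
   && grep -q 'at least 3 tokens' /tmp/sample_splunkd.log \
   && grep -q 'phonehome message' /tmp/sample_splunkd.log; then
  echo "deployment-client fingerprint present"
else
  echo "fingerprint not found"
fi
```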
I don't have an answer to this, but I actually want to re-open this thread, as it doesn't explain what is going on with my system. I have five 5.0.2 indexers and two search heads controlled by a deployment server VM, and then a bunch of 4.2.3 light forwarders that are ALL getting this error repeatedly. This is a new system and is very lightly loaded, so that explanation doesn't seem correct to me. I would like to know more about what exactly the message means. I get that it's "didn't get a response", but it gives no indication as to who the client was. And the fact that it is coming in on ALL the systems' logs (indexers, search heads, deployment server, and forwarders alike) is concerning.