We started losing access to Splunk Web running on Windows Server 2008. When we checked the service status, we noticed that the splunkweb service had gone down. I restarted splunkd and splunkweb, but splunkweb stopped again soon after. I then tried starting it from the Windows command line. The output says that splunkd and splunkweb started successfully, but splunk status shows that splunkweb is not running. I took a look at splunkd.log and found many errors like the following:
02-22-2012 19:47:42.595 -0600 WARN NetUtils - Error connecting - winsock error 10055
What is going on?
This error usually occurs when your server is both a deployment client and its own deployment server.
Example from splunkd.log on Windows:
07-03-2012 07:04:49.573 -0500 WARN NetUtils - Error connecting - winsock error 10055
Having an instance act as its own deployment server is not supported. Please avoid that: either remove deploymentclient.conf or use a dedicated deployment server.
See the remark at http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Configuredeploymentclients
You can also check the current network connections with "netstat" and "netstat -a". Maybe you can spot something unusual.
Good point.
For non-Splunk applications, definitely run "netstat -an", check which other apps are using sockets, and narrow it down to the applications that might be causing the issue.
Unfortunately, in the few Splunk cases I have seen, the output of netstat -an did not show many TCP or UDP sessions, so the output by itself did not help much in identifying the cause. At the very least, you can check which ports are in use.
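To make the netstat output easier to read, here is a minimal sketch that counts TCP rows by state and by remote port. It assumes the Windows "netstat -an" layout (protocol, local address, foreign address, state); the function name summarize_netstat is made up for this example and is not part of Splunk or Windows.

```python
from collections import Counter


def summarize_netstat(text):
    """Count TCP rows by state and by remote port.

    Assumes Windows `netstat -an` rows of the form:
        TCP  <local addr:port>  <foreign addr:port>  <state>
    """
    states = Counter()
    remote_ports = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and UDP rows (UDP rows have no state column).
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        remote, state = fields[2], fields[3]
        # rsplit handles IPv6 addresses such as [::1]:8089 as well.
        remote_ports[remote.rsplit(":", 1)[-1]] += 1
        states[state] += 1
    return states, remote_ports
```

You could save the output with "netstat -an > netstat.txt", read the file, and pass its contents to the function. A large count for a single state or a single remote port is a hint about where to look, nothing more.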
According to Microsoft, Winsock error 10055 (WSAENOBUFS) means that a socket operation could not be performed because the system lacked sufficient buffer space or because a queue was full.
When this happens, not only the Splunk ports but other ports in general become inaccessible. A telnet to any port would most likely fail. So Splunk might not be the cause of the issue.
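If telnet is not handy, a rough equivalent of that check can be sketched as follows. The helper names probe and probe_all are made up for this example; the ports in the usage note are only the usual Splunk defaults and may differ on your system.

```python
import socket


def probe(host, port, timeout=2.0):
    """Return None if a TCP connection succeeds, else the error as a string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        # On an affected Windows host the message may mention error 10055.
        return f"{type(exc).__name__}: {exc}"


def probe_all(host, ports, timeout=2.0):
    """Probe several ports and return {port: None or error string}."""
    return {port: probe(host, port, timeout) for port in ports}
```

For example, probe_all("127.0.0.1", [8000, 8089, 135]) would try Splunk Web, the splunkd management port, and a non-Splunk Windows port. If ports unrelated to Splunk fail in the same way, the problem is probably system-wide rather than specific to Splunk.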
You have to reboot the machine to clear the error. But if you cannot identify the cause, the same error is likely to come back eventually.
For Splunk, you should check splunkd.log for any network-related errors, such as those from Deployment Server, Deployment Client, TcpOutputFd, TcpOutputProc, TcpInputFd, TcpInputProc, etc. Then check whether the configuration files are correct and whether there is a network issue such as slow DNS or reverse DNS lookups.
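As a starting point for that log check, here is a minimal sketch that tallies WARN and ERROR lines from network-related components. The regular expression is based only on the log lines quoted in this thread, and the substrings in NETWORK_HINTS are my guess at how the component names above appear in the log, so adjust both as needed.

```python
import re
from collections import Counter

# Matches lines such as:
# 02-22-2012 19:47:42.595 -0600 WARN NetUtils - Error connecting - winsock error 10055
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<tz>[+-]\d{4}) (?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)$"
)

# Assumed substrings of network-related component names; not an official list.
NETWORK_HINTS = ("NetUtils", "Deploy", "TcpOutput", "TcpInput")


def count_network_problems(lines):
    """Count WARN/ERROR lines per (component, message) for network components."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("level") not in ("WARN", "ERROR"):
            continue
        component = match.group("component")
        if any(hint in component for hint in NETWORK_HINTS):
            counts[(component, match.group("message"))] += 1
    return counts
```

You can pass it an open file handle for splunkd.log and print counts.most_common(10) to see which component is logging the most connection errors.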
The last time we solved this issue, the Deployment Server and Deployment Client were running on the same Splunk instance and the configuration was incorrect. As a result, there were a lot of Deployment Client connection errors.