Hey guys,
Doesn't seem like many people have had problems with the Splunk Web Service hanging on them, but this is somewhat similar to:
Basically, I have been running my Foundstone vulnerability scanner against my Splunk indexers and search head, and on at least one of the servers the Splunk Web service dies. The port is still reachable via telnet and splunk status reports the service as up, but the site itself is unreachable.
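A quick way to tell this state apart from a fully dead service is to compare a bare TCP connect with a real HTTP(S) request. What follows is a minimal Python 3 sketch, not anything shipped with Splunk; the function name probe, the default timeout, and the choice to skip certificate verification are assumptions for illustration.

```python
import http.client
import socket
import ssl


def probe(host, port, use_ssl=True, timeout=5.0):
    """Return (tcp_ok, http_ok) for a web endpoint (hypothetical helper)."""
    # Step 1: bare TCP connect, which is all that telnet verifies.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            tcp_ok = True
    except OSError:
        return False, False

    # Step 2: a real request. A hung server may still complete the TCP
    # handshake but never send a response, so this step times out.
    if use_ssl:
        # Assumption: the server uses a self-signed certificate, so
        # verification is turned off for this health check only.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        conn = http.client.HTTPSConnection(host, port, timeout=timeout, context=ctx)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        conn.getresponse()
        http_ok = True
    except (OSError, http.client.HTTPException):
        # Covers timeouts, resets, and TLS failures.
        http_ok = False
    finally:
        conn.close()
    return tcp_ok, http_ok
```

A result of (True, False) from something like probe("splunk-host", 8000) would match the symptom described above; the hostname is a placeholder, and 8000 is only assumed to be the Splunk Web port on your install.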
I reproduced the situation while tailing the logs and saw errors like these:
06-22-2010 20:42:42.816 ERROR NetUtils - SSL_ERROR_SSL in SSL_write. nbytes=-1, Error = error:140D00CF:SSL routines:SSL_write:protocol is shutdown
06-22-2010 20:42:43.210 ERROR TcpInputFd - SSL Error = error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
06-22-2010 20:42:43.210 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
06-22-2010 20:42:43.210 ERROR TcpInputFd - SSL Error for fd from HOST:$foundstone_scanner$, IP:$foundstone_IP$, PORT:5927
06-22-2010 20:42:43.365 ERROR TcpInputFd - SSL Error = error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
06-22-2010 20:42:43.365 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
06-22-2010 20:42:43.365 ERROR TcpInputFd - SSL Error for fd from HOST:$foundstone_scanner$, IP:$foundstone_IP$, PORT:5939
06-22-2010 20:42:44.116 ERROR TcpInputFd - SSL Error = error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request
06-22-2010 20:42:44.116 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
06-22-2010 20:45:11.376 INFO TailingProcessor - File descriptor cache is full (64), trimming...
06-22-2010 20:45:17.397 INFO TailingProcessor - File descriptor cache is full (64), trimming...
06-22-2010 20:45:19.516 INFO TailingProcessor - File descriptor cache is full (64), trimming...
06-22-2010 20:46:22.830 INFO TailingProcessor - File descriptor cache is full (64), trimming...
This is the error message that appears right before it hangs:
2010-06-18 19:05:05,842 ERROR [4c1798223163010d0] root:120 - ENGINE: Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/opt/splunk/lib/python2.6/site-packages/cherrypy/process/servers.py", line 73, in _start_http_thread
  File "/opt/splunk/lib/python2.6/site-packages/cherrypy/wsgiserver/__init__.py", line 1662, in start
    self.tick()
  File "/opt/splunk/lib/python2.6/site-packages/cherrypy/wsgiserver/__init__.py", line 717, in tick
    s, addr = self.socket.accept()
  File "/opt/splunk/lib/python2.6/ssl.py", line 317, in accept
    newsock, addr = socket.accept(self)
  File "/opt/splunk/lib/python2.6/socket.py", line 195, in accept
error: [Errno 24] Too many open files
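Errno 24 means the process has run out of file descriptors, so it can help to compare how many descriptors a process holds against its soft limit. Here is a minimal Linux-only sketch in Python 3 (meant for a system Python, not Splunk's bundled 2.6). fd_usage is a hypothetical helper, and because it reads /proc, inspecting a process owned by another user needs matching privileges.

```python
import os


def fd_usage(pid="self"):
    """Return (open_fds, soft_limit) for a process, read from /proc.

    soft_limit is None when the limit is reported as "unlimited" or
    cannot be found. Hypothetical helper, Linux only.
    """
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            # Line format: "Max open files  <soft>  <hard>  files"
            if line.startswith("Max open files"):
                soft = line.split()[3]
                if soft.isdigit():
                    soft_limit = int(soft)
                break
    return open_fds, soft_limit
```

Calling fd_usage(pid) with the PID of the Splunk Web process while the scanner is running should show whether the count climbs toward the limit before the hang.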
Anyone ever have the same issue?
I also ran splunk diag so that I can open a support ticket, but I wanted to see if anyone else has ever had this problem.
Let me know.
Thanks Guys!
Brian
Can you check 2 things for me:
1) Does a ./splunk restart splunkweb fix the hang, or does it time out?
2) Can you run searches in CLI mode?
I have definitely seen this issue come up, but I am not sure whether yours is exactly the same one. If both of the above are the case, then we already have a fix for it. Contacting support is probably the best way to get this troubleshot further and to receive the new patch.
Let me know about the two conditions above.
Cheers,
.gz
I filed a support ticket with Splunk Support, and a patch was applied that remedied the issue. If anyone has a similar issue, file a support case and have the symptoms diagnosed by the Splunk engineers.
Thanks.
Brian
Mine only appears during vulnerability scanning.
As a short-term remedy, I set up a cron job to restart the service once a day.
Restarting the web service fixes the problem.
I never tried searches in CLI mode, since the service restart fixes the issue. I can try that and get back to you.
Thanks
Brian
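As a variation on the daily restart, a scheduled job could restart Splunk Web only when a health check fails. This is a rough Python 3 sketch under stated assumptions: restart_if_unhealthy is a hypothetical helper, the health check is whatever callable you supply, and the /opt/splunk/bin/splunk path is assumed from the install location shown in the traceback earlier in the thread.

```python
import subprocess

# Assumed path; adjust to wherever Splunk is installed.
RESTART_CMD = ["/opt/splunk/bin/splunk", "restart", "splunkweb"]


def restart_if_unhealthy(is_healthy, restart_cmd=RESTART_CMD):
    """Run restart_cmd only when is_healthy() returns False.

    is_healthy is any zero-argument callable returning a bool.
    Returns True if a restart was attempted, False otherwise.
    """
    if is_healthy():
        return False
    # check=True raises CalledProcessError if the restart command fails,
    # so a failed restart shows up in the scheduler's output.
    subprocess.run(restart_cmd, check=True)
    return True
```

This only works around the hang in the same way the cron restart does; the patch from Splunk Support remains the actual fix.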