Had myself a little denial of service today. Ran a Nessus scan for the first time on our main Splunk indexer/web interface. The scan caused Splunkweb to shut down...
2010-05-13 15:52:16,301 ERROR [4be85e4ab8125c290] root:120 - ENGINE: Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/splunk/app/splunk/lib/python2.6/site-packages/cherrypy/process/servers.py", line 73, in _start_http_thread
  File "/splunk/app/splunk/lib/python2.6/site-packages/cherrypy/wsgiserver/__init__.py", line 1662, in start
    self.tick()
  File "/splunk/app/splunk/lib/python2.6/site-packages/cherrypy/wsgiserver/__init__.py", line 1717, in tick
    s, addr = self.socket.accept()
  File "/splunk/app/splunk/lib/python2.6/ssl.py", line 317, in accept
    newsock, addr = socket.accept(self)
  File "/splunk/app/splunk/lib/python2.6/socket.py", line 195, in accept
error: [Errno 24] Too many open files
Is there something I should tune in Nessus to scale back the requests?
I run Nessus daily on the Splunk server without any issues. Perhaps you have a very aggressive profile? Can you share that?
My scan was really nothing special: one that I've run on dozens of other servers. Safe checks, moderate simultaneous threads, etc. I'm going to give it another go later today to see what happens.
Yeah, looks like it may be a Solaris x64 issue. I have a Splunk engineer researching for me (my Solaris-fu is not strong).
Solaris has an nofiles ulimit of 256 by default. Other unixes have larger defaults.
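If you want to see which limit a process is actually running with, here is a minimal Python sketch using the standard resource module (Unix only). The helper names nofile_limits and raise_soft_nofile are made up for illustration; this is not something Splunk ships, and I'm not claiming Splunkweb can be tuned this way from the inside. It only shows the mechanism: Errno 24 (EMFILE) is raised once a process hits its soft limit on open file descriptors.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file-descriptor limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile(target):
    """Raise the soft limit toward `target` (hypothetical helper).

    The soft limit is never lowered and never pushed past the hard limit.
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot exceed its hard limit, so cap the request.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only act when the request is a real increase over a finite soft limit.
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft=%s hard=%s" % nofile_limits())
```

In practice the more usual route is to raise the limit with ulimit -n in the shell or startup script that launches the service, since child processes inherit it. What value is appropriate for a given Splunk install is a judgment call, not something this sketch answers.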
There's a variety of possible responses we could have to this sort of thing:
Does this cure seem better than the poison? It introduces new problems.
It looks like this may have been fixed in 4.1.4. Have you tried it?