Splunk crashes frequently (but not always) when scanned by the Retina vulnerability scanning tool (http://www.eeye.com/products/retina/retina-network-scanner).
It appears to be related to the SSL handshake attempts that Retina makes against the Splunk service running on TCP/8080.
All Splunk forwarders reported the SSL errors in splunkd.log, but only some of them crashed. So it looks like Retina can crash Splunk, though not every time.
Here is splunkd.log just prior to a crash (the log entries that follow these relate to Splunk restarting):
08-29-2012 09:07:32.074 -0400 ERROR TcpInputFd - SSL Error for fd from HOST:172.28.41.199, IP:172.28.41.199, PORT:53924
08-29-2012 09:07:32.075 -0400 ERROR TcpInputFd - SSL_ERROR_SYSCALL ret errno:54
08-29-2012 09:07:32.075 -0400 ERROR TcpInputFd - SSL Error = error:00000000:lib(0):func(0):reason(0)
08-29-2012 09:07:32.075 -0400 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
08-29-2012 09:07:32.075 -0400 ERROR TcpInputFd - SSL Error for fd from HOST:172.28.41.199, IP:172.28.41.199, PORT:53925
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - SSL_ERROR_SYSCALL ret errno:54
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - SSL Error = error:00000000:lib(0):func(0):reason(0)
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - SSL Error for fd from HOST:172.28.41.199, IP:172.28.41.199, PORT:53931
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - SSL_ERROR_SYSCALL ret errno:54
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - SSL Error = error:00000000:lib(0):func(0):reason(0)
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - ACCEPT_RESULT=-1 VERIFY_RESULT=0
08-29-2012 09:07:32.077 -0400 ERROR TcpInputFd - SSL Error for fd from HOST:172.28.41.199, IP:172.28.41.199, PORT:53930
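To see which scanner addresses are producing these handshake failures, and how many, the "SSL Error for fd from" lines can be tallied per source IP. The following is only a rough sketch: the regular expression is written against the exact line format in the excerpt above, and the function name `count_ssl_errors_by_ip` is made up for illustration.

```python
import re
from collections import Counter

# Matches only the "SSL Error for fd from HOST:..., IP:..., PORT:..." lines,
# in the format shown in the splunkd.log excerpt above.
SSL_ERROR_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"ERROR TcpInputFd - SSL Error for fd from "
    r"HOST:(?P<host>[^,]+), IP:(?P<ip>[^,]+), PORT:(?P<port>\d+)"
)


def count_ssl_errors_by_ip(lines):
    """Return a Counter mapping source IP to number of failed SSL accepts."""
    counts = Counter()
    for line in lines:
        match = SSL_ERROR_RE.match(line.strip())
        if match:
            counts[match.group("ip")] += 1
    return counts


# Three lines copied from the excerpt; only the first and last are counted.
sample = [
    "08-29-2012 09:07:32.074 -0400 ERROR TcpInputFd - SSL Error for fd from HOST:172.28.41.199, IP:172.28.41.199, PORT:53924",
    "08-29-2012 09:07:32.075 -0400 ERROR TcpInputFd - SSL_ERROR_SYSCALL ret errno:54",
    "08-29-2012 09:07:32.075 -0400 ERROR TcpInputFd - SSL Error for fd from HOST:172.28.41.199, IP:172.28.41.199, PORT:53925",
]
sample_counts = count_ssl_errors_by_ip(sample)
```

To run it against a real forwarder, pass an open file handle for splunkd.log instead of `sample`; the log's location depends on where the forwarder is installed.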
A Splunk engineer has identified this as a defect (SPL-55686), and they expect to fix it. The fix is targeted for release 4.3.5 or later.
It looks like one or more of the firewall rules sent by Retina to the forwarder's splunkd (listening on port 8089) has triggered the crash. This seems like a potential bug in the forwarder: the splunkd process should not crash because it has received an offending firewall rule from the scanner.
If you disable the forwarder's default management port (8089), the issue should go away. Universal forwarders do not necessarily need to have the management port open to be functional.
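For anyone looking for the setting itself, here is a minimal sketch of what disabling the management port can look like on a universal forwarder. It assumes the `disableDefaultPort` setting in the `[httpServer]` stanza of server.conf is supported by your version (check the server.conf spec for your release), and the file location in the comment is likewise an assumption:

```ini
# $SPLUNK_HOME/etc/system/local/server.conf  (location assumed)
[httpServer]
# Assumed setting: stop splunkd from listening on the management port (8089 by default)
disableDefaultPort = true
```

The forwarder has to be restarted for a server.conf change to take effect. Anything that talks to splunkd over the management port will stop working once it is closed.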
Referred to this link to disable the management port:
This happens on universal forwarders for Linux and OS X.
On which OS and with which version was it happening?