internal splunk sockets stuck in CLOSE_WAIT

Engager

Why does splunkd close internal sockets before the receive queue has been emptied? This appears to leave them lying around in the CLOSE_WAIT state instead of moving directly to CLOSED.

$ netstat -tap |awk 'NR<3||/:8089/'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 example.com:8089            *:*                         LISTEN      14836/splunkd
tcp       38      0 example.com:54043           example.com:8089            CLOSE_WAIT  14927/python
tcp       38      0 example.com:54092           example.com:8089            CLOSE_WAIT  14927/python
tcp       38      0 example.com:54097           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54106           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54041           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54019           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54042           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54044           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53727           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54073           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53499           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54103           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53495           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53959           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54098           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53257           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54094           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54102           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53498           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53496           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54016           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54017           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54096           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53730           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54108           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54104           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53729           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53084           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54095           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53258           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54090           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53255           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53728           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53726           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53962           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54039           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54034           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53964           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54101           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54099           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53961           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54107           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54105           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53497           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53492           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53963           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53254           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54018           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54087           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:53256           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54093           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54091           example.com:8089            CLOSE_WAIT  14927/python        
tcp       38      0 example.com:54100           example.com:8089            CLOSE_WAIT  14927/python        

Re: internal splunk sockets stuck in CLOSE_WAIT

Splunk Employee

I believe CLOSE_WAIT can actually be an optimization technique -- the client's able to start sending data again without going through the full process of establishing the connection. However, I'm not an expert at this, so just thought I'd mention it as a possibility, and leave the real answers to someone who knows for sure. Good question though.

Re: internal splunk sockets stuck in CLOSE_WAIT

Champion

According to http://blog.olivierlanglois.net/index.php/2008/06/05/close_wait_vs_time_wait:

"A TCP connection goes into the CLOSE_WAIT state when it receives a FIN segment from its peer. From that point the connection becomes half-duplex and the TCP connection will not receive any new data from its peer ... the socket will stay there as long as the server does not call close() explicitly on the socket."

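So this looks like a socket that is never being closed rather than an optimization technique. In the netstat output above, the CLOSE_WAIT entries all belong to the python process (PID 14927), which means splunkd has already sent its FIN and the python side has not yet called close().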
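
For anyone who wants to see the behaviour from the quote in isolation, here is a minimal sketch on the loopback interface. It is not Splunk's code: the listener is just a hypothetical stand-in for splunkd, the function name `demo_half_closed` is made up, and the 38-byte reply is only chosen to mirror the Recv-Q column in the question.

```python
import socket


def demo_half_closed():
    """Show a client socket that has received a FIN but is not yet closed."""
    # Hypothetical stand-in for splunkd: a listener on an ephemeral port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # The server sends a short reply and closes, which sends a FIN.
    server_side.sendall(b"x" * 38)
    server_side.close()
    listener.close()

    # Once that FIN arrives, the client socket sits in CLOSE_WAIT and the
    # unread reply is what netstat reports in Recv-Q. Drain it until EOF.
    chunks = []
    while True:
        data = client.recv(4096)
        if not data:  # b"" means the peer's FIN has been reached
            break
        chunks.append(data)

    # The socket only leaves CLOSE_WAIT when the application closes it.
    client.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(len(demo_half_closed()), "bytes read before close()")
```

If you pause the script before `client.close()` and run netstat, the client end should show up in CLOSE_WAIT in the same way as the python sockets in the question, and it should disappear from that state once close() runs.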

Re: internal splunk sockets stuck in CLOSE_WAIT

I have the same problem, except I've got exactly 469 bytes left in the Recv-Q and thousands of leaked connections.

This has completely killed our installation of Splunk (and our enthusiasm for the product). Support wasn't able to help us and we've basically given up. 😞 If anyone finds a solution, please post it here.
