Just wondered if anyone has seen this issue in their environment?
I noticed, by chance, that our license usage was particularly low for the day so far. I soon found that most of the forwarders had stopped forwarding traffic to our Indexer.
I found the following messages in splunkd.log:
08-20-2013 04:55:38.976 +0100 INFO TcpInputProc - Stopping IPv4 port 9997
08-20-2013 04:55:38.976 +0100 WARN TcpInputProc - Stopping all listening ports. Queues blocked for more than 300 seconds
08-20-2013 04:55:41.004 +0100 INFO TcpInputProc - Starting IPv4 port 9997
08-20-2013 04:55:41.004 +0100 WARN TcpInputProc - Started listening on tcp ports. Queues unblocked
Although the messages imply Splunk started listening on 9997 again, it appears this is not entirely correct, as we lost a lot of data over about 10 hours. Luckily some of it can be retrieved, but some of it cannot. Restarting Splunk seems to have fixed the issue, or at least allowed traffic on 9997 again. Before the restart, Splunk still had ownership of port 9997.
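Since splunkd.log claimed the port was re-opened while data was still being lost, it can help to verify independently whether anything is actually accepting connections on 9997. Here is a minimal sketch of such a check, assuming the indexer is reachable as `localhost` (adjust the host/port for your environment); this is a hypothetical helper, not part of Splunk:

```python
# Hypothetical helper: confirm that the receiving port actually accepts
# TCP connections, independent of what splunkd.log reports.
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = "localhost", 9997  # assumed indexer address; adjust as needed
    if port_accepts_connections(host, port):
        print(f"{host}:{port} is accepting connections")
    else:
        print(f"{host}:{port} is NOT accepting connections")
```

Note that a successful TCP connect only shows the listener is up; it does not prove the indexer is draining its queues, so you would still want to watch the metrics/queue messages in splunkd.log alongside this.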
I have seen all connections to forwarders drop a number of times for no apparent reason. Adding this to the system//inputs.conf file on the indexer seems to have fixed the problem.