Ran into something today that I can't seem to find much information on. I've got a single light forwarder instance that has stopped forwarding data. Checking metrics.log shows that the following queues are blocked:
07-06-2012 09:24:23.412 -0500 INFO Metrics - group=queue, name=aggqueue, blocked=true, max_size_kb=1024, current_size_kb=1023, current_size=2635, largest_size=2635, smallest_size=2635
07-06-2012 09:24:23.412 -0500 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1283, largest_size=1283, smallest_size=1283
07-06-2012 09:24:23.412 -0500 INFO Metrics - group=queue, name=typingqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1235, largest_size=1235, smallest_size=1235
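For reference, I spotted these with a quick grep on the forwarder itself; a minimal sketch, assuming a default $SPLUNK_HOME install location:

    # List recent blocked-queue events from the forwarder's own metrics log
    grep 'group=queue' $SPLUNK_HOME/var/log/splunk/metrics.log | grep 'blocked=true'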
I have several other identical forwarders that are not seeing these issues, and the indexer is showing no signs of strain. There aren't any connectivity issues between the forwarder and the indexer, nor any recent config changes. All the info I've found here concerns queue blockages at the indexer, so I get the impression this isn't a common issue on the forwarder side.
Any ideas on what to look for here? I'm at a loss as to why this particular instance is failing.
Thanks!
I solved this by uninstalling and reinstalling the light forwarder. This didn't really shed any light on what the issue was, but it is functional now.
I might start by checking the thruput settings for that forwarder. Light and Universal forwarders have a bandwidth throttle between them and the indexers; the relevant setting is in limits.conf.
http://docs.splunk.com/Documentation/Splunk/latest/Admin/Limitsconf
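If it helps, the throttle lives in the [thruput] stanza. A minimal sketch of what removing (or raising) the cap might look like, assuming you're overriding it in the forwarder's local config:

    # $SPLUNK_HOME/etc/system/local/limits.conf on the forwarder
    [thruput]
    # maxKBps caps how fast the forwarder sends data to the indexers;
    # 0 means unlimited. Light/universal forwarder installs commonly
    # ship with a low cap (e.g. 256 KBps).
    maxKBps = 0

The forwarder needs a restart for a change here to take effect.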
Doesn't look like that's the issue. That setting appears to limit the rate of event processing on the indexer side of things, and even if it could be a cause, it defaults to no limit.