Getting Data In

How to troubleshoot error "forwarding to indexer group default-autolb group blocked for N seconds"?

msantich
Path Finder

hello

We have a Linux server running Splunk forwarder which forwards to one of two heavy forwarders in an autolb configuration.
The Splunk forwarder reports that it connects to the heavy forwarder, but I get a message in splunkd.log that says

forwarding to indexer group default-autolb group blocked for <nnnnn> seconds. 

From the point of view of the deployment monitor running on the indexer, the Splunk forwarder in question is "missing".
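
For context, the output configuration on this forwarder is the standard autolb setup. It looks roughly like the stanza below (host names are placeholders here, and I've left out our SSL settings):

    # outputs.conf on the light forwarder (illustrative; real host names redacted)
    [tcpout]
    defaultGroup = default-autolb-group

    [tcpout:default-autolb-group]
    server = heavyfwd1.example.com:8081, heavyfwd2.example.com:8081
    autoLB = true
    useACK = true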

Please help us diagnose our problem as we have a demo to a customer tomorrow.

thank you

1 Solution

msantich
Path Finder

Jkat54 - thanks for your response - here is some more data

I'm seeing the light forwarders connecting to the heavy forwarders on and off, but the connections keep dropping.

On the light forwarders, I'm getting errors like:

Read operation timed out expecting ack from ...

Possible duplication of events with channel=source ... offset = ... on host ...
Raw connection to ... timed out
Forwarding blocked ...
Applying quarantine to ...
Removing quarantine from ...
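
If I read those right, the ack timeouts mean the forwarder is waiting for an acknowledgement from the heavy forwarder and gives up after the read timeout. As far as I can tell, the relevant outputs.conf settings are these, and we have left them at the documented defaults (shown only for reference):

    # outputs.conf timeouts on the light forwarders (defaults, not overridden by us)
    [tcpout]
    useACK = true
    connectionTimeout = 20
    readTimeout = 300
    writeTimeout = 300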

On the heavy forwarders, I get errors like:
Forwarding to ... blocked
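
To check whether a queue on the heavy forwarders is actually backing up, I've been running a rough search like this against _internal (the usual metrics.log queue check, nothing polished):

    index=_internal source=*metrics.log* group=queue
    | eval is_blocked=if(blocked=="true", 1, 0)
    | stats max(is_blocked) AS was_blocked avg(current_size) AS avg_size BY host, name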

From the point of view of the deployment monitor, all the light forwarders in the system keep toggling between active and missing.
If I run ./splunk list forward-server on the light forwarders, I do not get consistent results.
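
For what it's worth, the output flips around from one run to the next, with servers moving between the two sections; it looks roughly like this (host names are placeholders):

    Active forwards:
        heavyfwd1.example.com:8081
    Configured but inactive forwards:
        heavyfwd2.example.com:8081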

We're using SSL. netstat reports connections on port 8081 (used from the light forwarders to the heavy forwarders) and 8082 (heavy forwarders to the indexer).

Thanks..

Michael.


msantich
Path Finder

We can close this. Of the many servers (Splunk light forwarders) that were failing to report, I rebooted one of the ones that was reporting all the forwarding-blocked error messages. Within 2 minutes the other servers began reporting in, and within 15 minutes all 34 servers in the domain had successfully reported and forwarded a day's worth of data to the heavy forwarders.

Though the issue is fixed, I'd like to know whether something we did or something in our config caused this to happen. Is there a tuning parameter set too tight, for example?
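
The kinds of settings I have in mind are the output queue size and the load-balancing interval on the forwarders; as far as I know we are still on the defaults, roughly:

    # outputs.conf candidates (defaults as I understand them; we have not changed these)
    [tcpout]
    maxQueueSize = auto
    autoLBFrequency = 30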

Thanks again to jkat54.

Thanks for any feedback you can give here.


jkat54
SplunkTrust

Nothing strikes me as being 'the problem'. Believe it or not, restarting to fix the problem works fairly often.

In your case I would set up an alert to monitor your _internal index and alert if the condition occurs again. At least you'll know the fix the next time it happens. If it does continue to happen, I would keep digging, open a support ticket, etc.
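
Something along these lines would work as the alert search (adjust the quoted strings if your version words the message differently), scheduled to run every few minutes and set to trigger when the count is greater than zero:

    index=_internal sourcetype=splunkd "blocked for" "default-autolb"
    | stats count BY host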
