How to fix error "Forwarding to indexer group default-autolb-group blocked for 100 seconds"?

Yaichael
Communicator

How do I solve this issue through Splunk Web?

Forwarding to indexer group default-autolb-group blocked for 100 seconds
1 Solution

lguinn2
Legend

When I have seen this problem, the cause has often been that someone accidentally put an outputs.conf on the indexer, so the indexer tries to forward to itself in a very nasty loop. On the indexer, look for outputs.conf in a local directory, either under $SPLUNK_HOME/etc/system or somewhere under $SPLUNK_HOME/etc/apps. If you find one, you can probably just remove it. Restart Splunk for the change to take effect.
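
The search described above could be scripted roughly as follows. This is only a sketch: the function name `find_local_outputs_conf` and the `/opt/splunk` fallback are assumptions for illustration, and it looks only at `etc/system/local` and at `local` directories directly under each app.

```python
import os
from pathlib import Path


def find_local_outputs_conf(splunk_home):
    """Return every outputs.conf found in a 'local' directory under etc/system or etc/apps."""
    etc = Path(splunk_home) / "etc"
    candidates = [etc / "system" / "local" / "outputs.conf"]
    # Each app directory may carry its own local/outputs.conf.
    candidates.extend(sorted((etc / "apps").glob("*/local/outputs.conf")))
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    # /opt/splunk is an assumed fallback, used only when SPLUNK_HOME is not set.
    home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    for path in find_local_outputs_conf(home):
        print(path)
```

Running `splunk btool outputs list --debug` on the indexer should give similar information, since it prints the effective outputs settings along with the file each one comes from.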

alohrer
New Member

In my case, the disk partition allocated to Splunk was full. I detected it with "btrfs fi show", which reported "size 1024.00GiB used 1024.00GiB". Normally the second value is lower than the first, for example "size 1024.00GiB used ~500.00GiB". I solved it by rebalancing with "btrfs balance start -dusage=30".
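
As a rough illustration of that check, the sketch below reads a line containing the "size ... used ..." fragment quoted above and returns the fraction that is in use. The function name `allocation_ratio` is hypothetical, and the parsing assumes only the fragment shown in this post, not the full output format of the tool.

```python
import re

_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
_PATTERN = re.compile(r"size\s+([\d.]+)([KMGT]iB|B)\s+used\s+([\d.]+)([KMGT]iB|B)")


def allocation_ratio(line):
    """Return used/size for a line containing 'size <n><unit> used <n><unit>', else None."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    size = float(match.group(1)) * _UNITS[match.group(2)]
    used = float(match.group(3)) * _UNITS[match.group(4)]
    if size == 0:
        return None
    return used / size
```

A ratio of 1.0 corresponds to the "size 1024.00GiB used 1024.00GiB" case described above.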

sylim_splunk
Splunk Employee

If your forwarders are struggling to send data in a timely manner, and the problem started around the time of one of the changes below, look into that change first.

i) New inputs added
- Have you recently added new inputs that could have overloaded the indexers?
If so, try disabling them to see whether the situation improves, then investigate why they caused the problem.

  • Do you see many log messages from the DateParserVerbose or LineBreakingProcessor categories? If these messages complain about invalid timestamps or line breaks, the input and props configurations are making it hard for Splunk to process the data. You will need to correct the configuration for those inputs first. http://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition

Whenever you see messages from these categories, a misconfiguration is making it hard for Splunk to process the data properly.
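
One way to get a quick count of those messages is sketched below. It assumes splunkd.log lines follow the layout "<date> <time> <timezone> <LEVEL> <Component> - <message>"; the regular expression and the function name `count_parsing_warnings` are illustrative assumptions, not part of Splunk.

```python
import re
from collections import Counter

# Assumed splunkd.log layout: "<date> <time> <tz> <LEVEL>  <Component> - <message>"
_LINE = re.compile(r"^\S+\s+\S+\s+\S+\s+(?P<level>[A-Z]+)\s+(?P<component>\w+)\s+-\s")

WATCHED = ("DateParserVerbose", "LineBreakingProcessor")


def count_parsing_warnings(lines, watched=WATCHED):
    """Count log lines per watched component; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = _LINE.match(line)
        if match and match.group("component") in watched:
            counts[match.group("component")] += 1
    return counts
```

It can be fed an open file, for example `count_parsing_warnings(open(path))` with `path` pointing at $SPLUNK_HOME/var/log/splunk/splunkd.log on the indexer.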

ii) New searches: check for expensive searches running around the time the messages were written to splunkd.log.
If the issue appeared soon after new searches were added, either by new apps or by Splunk users, look for expensive searches in the Monitoring Console. This can happen when these searches prevent the indexers from accessing the disk in a timely manner.
Also check the performance of the disk subsystem: verify that its IOPS meet the recommended level.
http://docs.splunk.com/Documentation/Splunk/6.0/Installation/Referencehardware

iii) Slowdown in processing index data
This is similar to the point above. Check the status of the indexer queues to see where the blocking started.
- If it started at the indexing queue, the cause could be load on the disk.
- If it started at the typing queue, the cause could be expensive regular expressions.
- If it started at aggQueue, the cause could be a timestamp recognition or line-breaking problem in some inputs.
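
Besides the Monitoring Console, the queue status can be read from metrics.log on the indexer. The sketch below counts how often each queue was reported as blocked. It assumes queue lines look like "... Metrics - group=queue, name=indexqueue, blocked=true, ..." and that the queue names appear in lowercase (indexqueue, typingqueue, aggqueue); the function names are hypothetical.

```python
from collections import Counter


def parse_kv(line):
    """Split the part after ' - ' into a dict of key=value fields."""
    _, _, payload = line.partition(" - ")
    pairs = {}
    for field in payload.split(","):
        key, sep, value = field.strip().partition("=")
        if sep:
            pairs[key] = value
    return pairs


def count_blocked_queues(lines):
    """Count lines with group=queue and blocked=true, keyed by queue name."""
    blocked = Counter()
    for line in lines:
        fields = parse_kv(line)
        if fields.get("group") == "queue" and fields.get("blocked") == "true":
            blocked[fields.get("name", "unknown")] += 1
    return blocked
```

The queue with the most blocked entries is a reasonable starting point for the three cases listed above.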

iv) Has this been happening for a long time?
- Then it was probably caused by a mix of the above.

Please check the points above. If the problem persists, or you cannot locate the cause, contact Splunk Support and provide the following:
- Splunk deployment architecture
- Diags from your indexers
- Time of the incident, so that Support knows where to look in the log files.

prpohar
Engager

Hi sylim,

How do I perform the checks you mentioned under point iii), on slow processing of index data?

sylim_splunk
Splunk Employee

You can use the Monitoring Console for that: https://docs.splunk.com/Documentation/Splunk/6.6.3/DMC/WhatcanDMCdo
Check the indexing performance there.

shan_santosh
Explorer

The same problem happened to me because of the instruction to create a "send to indexer" app on the indexer while deploying the Splunk App for Windows Infrastructure.

jensonthottian
Contributor

This can happen for many reasons. Some of them are:
1. The indexer cluster is down
2. The forwarder cannot connect to the indexer (network issues)
3. The indexer's receiving port is backed up (queued)

Do you see this time increasing? Your sample log says blocked for 100 seconds; does it go up further to 200, 300, ...?
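
To answer that question from the log rather than by eye, a small sketch like the following could pull the number of seconds out of each message and test whether it keeps growing. The pattern relies only on the wording "blocked for N seconds" seen in this thread; the function names are hypothetical.

```python
import re

_BLOCKED = re.compile(r"blocked for (\d+) seconds")


def blocked_seconds(lines):
    """Return the N from every 'blocked for N seconds' message, in order of appearance."""
    values = []
    for line in lines:
        match = _BLOCKED.search(line)
        if match:
            values.append(int(match.group(1)))
    return values


def is_growing(values):
    """True when there are at least two readings and each is larger than the previous one."""
    return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))
```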

First, try a telnet from your forwarder to the indexer: telnet indexerIP 9997
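
Where telnet is not installed on the forwarder, roughly the same reachability test can be done with a plain TCP connect. This is a minimal sketch; `can_connect` is a hypothetical helper, and 9997 is simply the port used in the telnet example above.

```python
import socket


def can_connect(host, port=9997, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful connection shows only that the port accepts TCP connections. It does not show that the indexer is actually processing the data it receives.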

Nadhiyag
Explorer

Hi, I am facing the same issue.

When I run telnet, it shows that the connection succeeds, but data is not being forwarded.

Below is my error:

The TCP output processor has paused the data flow. Forwarding to output group default-autolb-group has been blocked for 6200 seconds. This will probably stall the data flow towards indexing and other network outputs. Review the receiving system's health in the Splunk Monitoring Console. It is probably not accepting data.

Yaichael
Communicator

Thanks for the reply.

No, the time isn't increasing. It stays at 100 seconds.

I tried telnet, and it reports that the connection failed.

jensonthottian
Contributor

If telnet fails, it means one of the following:
1. Port 9997 on the indexer is not open
2. The splunkd service on the indexer is down
