Hi All,
I have 4 heavy forwarder servers sending data to 5 indexers:
server1 acts as a syslog server, with autoLBFrequency = 10 and maxQueueSize = 1000MB
server2 acts as a syslog server and heavy forwarder, with autoLBFrequency = 10 and maxQueueSize = 500MB
server3 acts as a heavy forwarder, with autoLBFrequency = 10 and maxQueueSize = 500MB
server4 acts as a heavy forwarder, with autoLBFrequency = 10 and maxQueueSize = 500MB
We are receiving blocked=true in metrics.log while the syslog/heavy forwarder servers try to send data to the indexers. Because of this, ingestion is delayed and data arrives in Splunk 2-3 hours late.
One of the 5 indexers is consistently at 99-100% CPU utilization; it has 24 CPUs, and the other indexers also run with 24 CPUs.
We are planning to upgrade only the highly utilized indexer from 24 to 32 CPUs.
Kindly suggest whether updating the settings below in outputs.conf will reduce/stop the "blocked=true" messages in metrics.log and bring the indexer CPU load back to normal before we upgrade the CPU,
or whether we need to do both, i.e. change outputs.conf and upgrade the CPU. If both are needed, which should we try first? Kindly help.
autoLBFrequency = 5
maxQueueSize = 1000MB
aggQueueSize = 7000
outputQueueSize = 7000
As per the monitoring console, the indexing queue and splunktcpin queue are running high.
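For clarity, the settings we are proposing to change live in outputs.conf on each heavy forwarder. A minimal sketch of the relevant stanzas (the group name and indexer addresses are placeholders, not our real values; only autoLBFrequency and maxQueueSize are shown here):

# outputs.conf on each heavy forwarder (placeholder group name and hosts)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = idx1:9997, idx2:9997, idx3:9997, idx4:9997, idx5:9997
# switch to the next indexer every 5 seconds instead of every 10
autoLBFrequency = 5
# size of the forwarder's in-memory output queue
maxQueueSize = 1000MB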
Further to my last reply - there are also a couple of worthwhile resources here which give an overview of how to identify and deal with blocked queues.
https://docs.splunk.com/Documentation/Splunk/8.2.4/Deploy/Datapipeline
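If it helps, a quick way to see how full each queue is over time on a given server is a search along these lines (a sketch; it uses the current_size_kb and max_size_kb fields from the group=queue metrics events, and <your_indexer> is a placeholder):

index=_internal source=*metrics.log group=queue host=<your_indexer>
| eval pct_full = round(current_size_kb / max_size_kb * 100, 1)
| timechart avg(pct_full) by name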
Thanks @livehybrid for your inputs. I checked blocked=true on all 4 heavy forwarders and could see it on the one that acts as the syslog server collecting the network-related data; there the typing queue is the one blocking, which the PDF describes as the bottleneck.
As per the PDF, I grepped metrics.log to see which sourcetype and host are consuming the most regex CPU:
04-22-2025 05:19:58.017 +0700 INFO Metrics - group=per_sourcetype_regex_cpu, series="cp_log", cpu=604, cpupe=0.0005149352537121802, bytes=1072305900, ev=1172963
04-22-2025 05:19:58.011 +0700 INFO Metrics - group=per_host_regex_cpu, series="networkserver", cpu=596, cpupe=0.0005081981051714273, bytes=1072185809, ev=1172771
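Rather than grepping, the same ranking can be pulled from the _internal index with something like the search below (a sketch; it assumes the heavy forwarder forwards its own _internal data and uses the cpu, ev, and series fields visible in the events above):

index=_internal source=*metrics.log group=per_sourcetype_regex_cpu
| stats sum(cpu) as total_cpu sum(ev) as events by series
| sort - total_cpu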
Kindly let me know what to do next.
Hi
Increasing autoLBFrequency, maxQueueSize, aggQueueSize, or outputQueueSize in outputs.conf on your heavy forwarders may help temporarily reduce "blocked=true" messages, but these settings do not address the root cause: your indexer(s) are overloaded and unable to keep up with incoming data.
The following will tell you which queues are blocking on which servers:
index=_internal source=*metrics.log blocked=true | stats count by host, group, name
Do not rely solely on queue size increases; this can delay but not prevent data loss if indexers remain overloaded.
Investigate why one indexer is overloaded (check for hot buckets, network issues, or misconfigured load balancing). Understanding *why* that single indexer is blocking is probably the important thing here - it could be a number of things, but it is most likely either a resource issue (e.g. a faulty disk) or one of your syslog feeds failing to balance across to another indexer.
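One way to check whether the forwarders are actually spreading load evenly is to compare how much data each indexer receives on its splunktcp input, for example (a sketch using the kb and sourceHost fields from the group=tcpin_connections metrics events):

index=_internal source=*metrics.log group=tcpin_connections
| stats sum(kb) as received_kb by host, sourceHost
| sort - received_kb

If one indexer (host) receives far more than its share, or one forwarder (sourceHost) only ever appears against that indexer, that points at the load balancing rather than the indexer hardware.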
Is it always the same indexer that runs hot? Or does it change?