Splunk Enterprise Security

Receiving blocked=true while syslog/heavy forwarders try to send data to indexer servers

sureshkumaar
Path Finder

Hi All,

I have 4 heavy forwarder servers sending data to 5 indexers:

server1 acts as a syslog server and has autoLBFrequency = 10 and maxQueueSize = 1000MB

server2 acts as a syslog server and heavy forwarder and has autoLBFrequency = 10 and maxQueueSize = 500MB

server3 acts as a heavy forwarder and has autoLBFrequency = 10 and maxQueueSize = 500MB

server4 acts as a heavy forwarder and has autoLBFrequency = 10 and maxQueueSize = 500MB

I am receiving blocked=true in metrics.log while the syslog/heavy forwarders try to send data to the indexer servers. Because of this, index ingestion is getting delayed and data is arriving in Splunk 2-3 hours late.

One of the 5 indexer servers is consistently running at 99-100% CPU; it has 24 CPUs, and the other indexer servers also run with 24 CPUs.

I am planning to upgrade only the highly utilized indexer server from 24 to 32 CPUs.

Kindly suggest whether updating outputs.conf as below will reduce/stop the "blocked=true" messages in metrics.log and bring the CPU load on the indexer back to normal before upgrading the CPU.

Or do we need to do both, the outputs.conf changes and the CPU upgrade? If both are needed, which should we try first? Kindly help.

autoLBFrequency = 5
maxQueueSize = 1000MB
aggQueueSize = 7000
outputQueueSize = 7000
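
For reference, a rough sketch of how I am planning to place these settings in outputs.conf on the heavy forwarders (the target group name and indexer hostnames below are placeholders, not my real servers):

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer1:9997, indexer2:9997, indexer3:9997, indexer4:9997, indexer5:9997
# switch to another indexer more frequently to spread load
autoLBFrequency = 5
# larger output queue on the forwarder side
maxQueueSize = 1000MB
# (still confirming the correct location for aggQueueSize / outputQueueSize)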


sureshkumaar
Path Finder

As per the monitoring console, I could see that the indexing queue and the splunktcpin queue are running high.

Screenshot 2025-04-24 094537.png
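
For reference, a rough equivalent of what the monitoring console shows, searched directly against metrics.log (a sketch, not the exact panel query), to see how full those two queues get per indexer:

index=_internal source=*metrics.log group=queue (name=indexqueue OR name=splunktcpin)
| eval fill_pct=round(current_size_kb/max_size_kb*100,1)
| stats max(fill_pct) as max_fill_pct by host, name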


livehybrid
SplunkTrust

Hi @sureshkumaar 

Further to my last reply - there are also a couple of worthwhile resources here which give an overview of how to identify and deal with blocked queues.

https://docs.splunk.com/Documentation/Splunk/8.2.4/Deploy/Datapipeline

How to Troubleshoot Blocked Ingestion Pipeline Queues with Indexers and Forwarders - https://conf.sp...



sureshkumaar
Path Finder

Thanks @livehybrid for your inputs. I checked blocked=true on all 4 heavy forwarders and could see that on one of them, the heavy forwarder that acts as a syslog server collecting the network-related data, the typing queue is the one that is blocked, which the PDF identifies as a bottleneck.

And as per the PDF, I grepped metrics.log to see which sourcetype and host are consuming more CPU:

04-22-2025 05:19:58.017 +0700 INFO Metrics - group=per_sourcetype_regex_cpu, series="cp_log", cpu=604, cpupe=0.0005149352537121802, bytes=1072305900, ev=1172963

04-22-2025 05:19:58.011 +0700 INFO Metrics - group=per_host_regex_cpu, series="networkserver", cpu=596, cpupe=0.0005081981051714273, bytes=1072185809, ev=1172771

Screenshot 2025-04-22 114936.png
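
To rank them rather than reading individual lines, something along these lines should aggregate the regex CPU by sourcetype (a sketch based on the PDF, not my exact search):

index=_internal source=*metrics.log group=per_sourcetype_regex_cpu
| stats sum(cpu) as total_cpu, sum(ev) as events by series
| sort - total_cpu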

Kindly let me know what to do next.


livehybrid
SplunkTrust

Hi

Increasing autoLBFrequency, maxQueueSize, aggQueueSize, or outputQueueSize in outputs.conf on your heavy forwarders may help temporarily reduce "blocked=true" messages, but these settings do not address the root cause: your indexer(s) are overloaded and unable to keep up with incoming data.

The following will tell you which queues are blocking on which servers:

index=_internal source=*metrics.log blocked=true
| stats count by host, group, name

 

  • "blocked=true" in metrics.log means the forwarder cannot send data to the indexer because the indexer is not accepting it fast enough (usually due to CPU, disk, or queue saturation).
  • Increasing forwarder queue sizes only buffers more data; it does not fix indexer bottlenecks.
  • The indexer with 99–100% CPU is a clear bottleneck. Upgrading its CPU may help, but if the load is not balanced across all indexers, you may need to investigate why (e.g., uneven load balancing, hot buckets, or misconfiguration); see the example search after this list.
  • Lowering autoLBFrequency (e.g., from 10 to 5) can help distribute load more evenly, but will not solve indexer resource exhaustion.
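
As a quick check on the load-balancing point above, something like this (a sketch; run it over a representative time range) shows how events are distributed across your indexers:

| tstats count where index=* by splunk_server
| sort - count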

 

Do not rely solely on queue size increases; this can delay but not prevent data loss if indexers remain overloaded.

Investigate why one indexer is overloaded (check for hot buckets, network issues, or misconfigured load balancing). Understanding *why* the single indexer is blocking is probably the important thing here - it could be a number of things, but it is likely to be either a resource issue (e.g. a faulty disk) or one of your syslog feeds failing to balance to another indexer.

Is it always the same indexer that runs hot? Or does it change?
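
One way to answer that from _internal (a rough sketch, adjust the span/time range as needed) is to compare indexing throughput per indexer over time and see whether the same host is consistently doing most of the work:

index=_internal source=*metrics.log group=thruput name=index_thruput
| timechart span=15m sum(kb) by host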

