I have been troubleshooting blocked queues and have been gradually eliminating them. My last step was to switch from a heavy forwarder to a universal forwarder, eliminating all processing activities on the forwarder. This helped a lot, but now my universal forwarder is logging blocked=true messages for its parsing queue. (In a ten-minute period, about 75% of the parsingqueue messages in metrics.log are "blocked".)
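A search along these lines will show that ratio; the host value is a placeholder for the Central UF, not my actual hostname:

index=_internal host=<central_uf> source=*metrics.log* group=queue name=parsingqueue
| stats count AS total, count(eval(blocked="true")) AS blocked
| eval pct_blocked=round(blocked/total*100,1)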
Log flow: [~150 UniversalForwarders] -> [Central UniversalForwarder] -> [Indexer], with my "Central UF" being the problem child.
My indexer is showing no issues (all queues at 0).
My network is a 10Mb connection, and my throughput is showing ~10% used, so it doesn't seem to be the network.
The Central UF is passing about 15GB of data a day at a steady rate.
I boosted the queue size up to 30MB, and I still get the same issue. (I confirmed my 30MB setting actually took effect.)
I have changed my limits.conf [thruput] to: maxKBps = 0
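For anyone following along, these are the two changes in .conf form; treat this as a sketch, with the stanza names taken from the standard server.conf/limits.conf conventions rather than copied from my files (a forwarder restart is needed for either to take effect):

server.conf:
[queue=parsingQueue]
maxSize = 30MB

limits.conf:
[thruput]
maxKBps = 0

Setting maxKBps = 0 removes the forwarder's default 256KBps output cap entirely.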
Second question that I was unable to find an answer for: why is there a parsing queue on the universal forwarder, if only heavy forwarders actually do any parsing?
-- Why is there a parsing queue on the universal forwarder, if only heavy forwarders actually do any parsing?
It refers to the parsing queue of the indexer.
That part throws me off, because my indexer has had no "blocks" on any queue for the past week.
The "indexing performance" gui in the DMC shows all pipelines @ 0% on my indexer.
Universal forwarders have limited throughput out of the box. From the documentation:
Universal and lightweight forwarders have a default thruput limit of 256KBps. This default can be configured in limits.conf. The default value is correct for a forwarder with a low profile, indexing up to ~920MB/hour. But in the case of higher indexing volumes, or when the forwarder has to collect historical logs after the first start, the default might be too low. This can delay recent events.
Oh, I forgot to add that in my post.
I have also changed thruput from 256 to 0.
Are any of the other queues on this forwarder filled as well? If not, the output throttling wouldn't seem to be the issue, as that would result in those downstream queues filling.
If only your parsing queue is filled, it could just be insufficient resources on the forwarder. What does your CPU and memory usage look like?
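A quick way to see per-queue fill on the forwarder is something like this (hostname is a placeholder; queues reporting max_size_kb=0 are skipped):

index=_internal host=<central_uf> source=*metrics.log* group=queue
| eval pct_full=if(max_size_kb>0, round(current_size_kb/max_size_kb*100,1), null())
| timechart span=1m max(pct_full) BY name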
The only other item filling is my "splunktcpinput", and I assume that is a direct result of my parsing queue filling.
As far as resources are concerned:
My forwarder is at 1GB of RAM used out of 4GB, and 1-2% of CPU on the single core used.
I know this is a very old post, but I am seeing the same problem. Only name=execprocessorinternalq and parsingqueue are blocked, and only on one forwarder; the others are working fine. Deployment is UF -> HFs -> IDXs.
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=execprocessorinternalq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=162, largest_size=162, smallest_size=162
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=fschangemanager_queue, max_size_kb=5120, current_size_kb=1, current_size=7, largest_size=7, smallest_size=7
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=httpinputq, max_size_kb=0, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=indexqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=nullqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=10240, current_size_kb=10239, current_size=308, largest_size=308, smallest_size=308
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=splunktcpin, max_size_kb=0, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=structuredparsingqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
05-20-2020 19:16:10.812 +1000 INFO Metrics - group=queue, name=tcpin_queue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
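To see how often those two queues block over time, and whether the blocks line up, something like this should work:

index=_internal source=*metrics.log* group=queue blocked=true (name=execprocessorinternalq OR name=parsingqueue)
| timechart span=5m count BY name

Since execprocessorinternalq feeds scripted/exec input output into the parsing queue, a blocked parsingqueue will typically back it up as well.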
I try to remember to post my resolutions after things get fixed, but unfortunately this one slipped through the cracks. It was a long time ago, and if I remember correctly, the fix was indirect. (It's been a while, but I THINK this is what happened.)
To resolve:
I believe the main issue causing this whole thing was the slow storage on the indexers. Nothing was really reporting full queues except the forwarder, but reducing the incoming log volume fixed it immediately, and after upgrading the hardware we are able to push 20-30GB a day with no issues.
The slow network has since been upgraded (latency is the same, but the pipe is fatter), but I am not certain that was ever the issue.