Hello
I have a Universal Forwarder that acts as an intermediary forwarder between about 200 other UFs and the Indexer. Data from the client UFs is delayed or not sent at all. Digging through the logs, I noticed this:
On the client UF:
09-29-2012 05:42:07.620 -0400 WARN TcpOutputFd - Connect to 10.xx.xx.60:9997 failed. No connection could be made because the target machine actively refused it.
09-29-2012 05:42:07.620 -0400 ERROR TcpOutputFd - Connection to host=10.xx.xx.60:9997 failed
09-29-2012 05:42:07.620 -0400 WARN TcpOutputProc - Applying quarantine to idx=10.xx.xx.60:9997 numberOfFailures=52
On the intermediary UF:
splunkd.log:
09-29-2012 06:35:48.722 -0400 INFO TailingProcessor - Could not send data to output queue (parsingQueue), retrying...
09-29-2012 06:35:49.504 -0400 INFO TailingProcessor - ...continuing.
metrics.log:
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=fschangemanager_queue, max_size_kb=5120, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=indexqueue, max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=nullqueue, max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=511, current_size=409, largest_size=445, smallest_size=163
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=splunktcpin, max_size_kb=500, current_size_kb=423, current_size=218, largest_size=442, smallest_size=152
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=tcpin_queue, max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=winparsing, max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=1, smallest_size=0
I already updated the etc\system\local\server.conf on the intermediary UF adding this:
[queue]
maxSize = 10MB
[queue=parsingqueue]
maxSize = 10MB
[queue=splunktcpin]
maxSize = 10MB
However, as you can see from the log, those two queues are the only ones that do not seem to be affected: they still show max_size_kb of 500-512 KB instead of the 10 MB I set. Any idea why? Or am I not even on the right track?
Looking at your metrics log, I see that the parsing queue is being blocked. Check here for some ideas on a blocked parsing queue: http://splunk-base.splunk.com/answers/45676/what-causes-queues-on-indexer-to-block
This is a little old, but may also help: http://wiki.splunk.com/Community:TroubleshootingBlockedQueues
And finally: http://splunk-base.splunk.com/answers/38218/universal-forwarder-parsingqueue-kb-size
(Notice how it is written queue=parsingQueue, with a capital Q.) The stanza name might be case sensitive.
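To spot which queues are blocked or nearly full without reading metrics.log by eye, something like the following could help. It is only a rough sketch: the function names and the 0.8 threshold are my own choices for illustration, and it assumes the group=queue lines look like the ones pasted in the question.

```python
import re

# Matches key=value pairs such as "name=parsingqueue" or "max_size_kb=512".
PAIR_RE = re.compile(r"(\w+)=([^,\s]+)")


def parse_queue_line(line):
    """Return the fields of a group=queue metrics line as a dict, else None."""
    if "group=queue" not in line:
        return None
    return dict(PAIR_RE.findall(line))


def fill_report(lines, threshold=0.8):
    """Yield (name, fill_ratio, blocked) for queues that are blocked
    or whose current_size_kb / max_size_kb is at or above threshold."""
    for line in lines:
        fields = parse_queue_line(line)
        if not fields:
            continue
        max_kb = int(fields["max_size_kb"])
        cur_kb = int(fields["current_size_kb"])
        ratio = cur_kb / max_kb if max_kb else 0.0
        blocked = fields.get("blocked") == "true"
        if blocked or ratio >= threshold:
            yield fields["name"], round(ratio, 3), blocked


# Three of the lines from the question's metrics.log excerpt.
sample = [
    "09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=indexqueue, "
    "max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0",
    "09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=parsingqueue, blocked=true, "
    "max_size_kb=512, current_size_kb=511, current_size=409, largest_size=445, smallest_size=163",
    "09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=splunktcpin, "
    "max_size_kb=500, current_size_kb=423, current_size=218, largest_size=442, smallest_size=152",
]

for name, ratio, blocked in fill_report(sample):
    print(name, ratio, "blocked" if blocked else "not blocked")
```

On the sample lines this flags parsingqueue and splunktcpin, which are the two queues still at their small default sizes.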
Thank you MuS! It turns out the queue referred to as "splunktcpin" in metrics.log is actually the Splunk listener on port 9997, defined in .\etc\system\local\inputs.conf with this stanza:
[splunktcp://9997]
# this is my change
queueSize = 10MB
The other one I had issues with, parsingQueue (which alacercogitatus helped me with), is defined in a different conf file - .\etc\system\local\server.conf, with:
[queue=parsingQueue]
# this is my change
maxSize = 30MB
A little confusing... 😞
Thank you both!
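For anyone wanting to confirm that a size change actually took effect after a restart, one option is to compare the configured value against the max_size_kb that metrics.log reports. This is a hedged sketch only: size_to_kb and setting_applied are hypothetical helper names, and it assumes 1 MB is reported as 1024 KB, which matches the 10MB / max_size_kb=10240 pairs in the question.

```python
import re

# Multipliers to convert a configured size into KB (1 MB = 1024 KB assumed).
UNITS_KB = {"KB": 1, "MB": 1024, "GB": 1024 * 1024}


def size_to_kb(text):
    """Convert a size string such as '10MB' or '500KB' into kilobytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB)\s*", text, re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * UNITS_KB[match.group(2).upper()]


def setting_applied(metrics_line, queue_name, configured):
    """Return True/False if the line is for queue_name and its max_size_kb
    does/does not equal the configured size; None for other queues.
    Names are compared in lower case because metrics.log prints them that way."""
    fields = dict(re.findall(r"(\w+)=([^,\s]+)", metrics_line))
    if fields.get("name", "").lower() != queue_name.lower():
        return None
    return int(fields["max_size_kb"]) == size_to_kb(configured)


# Lines taken from the metrics.log excerpt in the question.
parsing_line = ("09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=parsingqueue, "
                "blocked=true, max_size_kb=512, current_size_kb=511, current_size=409, "
                "largest_size=445, smallest_size=163")
index_line = ("09-29-2012 06:27:24.509 -0400 INFO Metrics - group=queue, name=indexqueue, "
              "max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0")

print(setting_applied(parsing_line, "parsingQueue", "10MB"))
print(setting_applied(index_line, "indexQueue", "10MB"))
```

Against the original excerpt, the parsingqueue line does not match the 10MB setting while indexqueue does, which is the symptom described in the question.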
And if this does not work, then the correct name for the queue is 'tcpin_queue'.
cheers,
MuS
Hi, according to the docs, it can be set in inputs.conf in the [TCP] stanza: http://docs.splunk.com/Documentation/Splunk/4.3.4/admin/Inputsconf
queueSize =
* Maximum size of the in-memory input queue.
* Defaults to 500KB.
I can't seem to find the correct case either. It could be splunkTcpin or SplunkTcpIn; I'd try a few different cases and see if any of them work. I'll keep looking as well.
Thank you alacercogitatus - parsingQueue seems to have done the trick for that queue (I guess the case shown in metrics.log is not the one the stanza expects). The only other one that I still see getting blocked relatively frequently is splunktcpin, which also does not seem to be affected by the global [queue] stanza. I tried splunkTcpIn - that does not seem to be it. Any suggestions on what the case for this one would be? I can't seem to find documentation on that (other than all lower case).