Hello,
I have set up intermediate forwarding.
Here is a quick overview of the infrastructure:
light-forwarder -> intermediate-forwarder -> search-head
light-forwarder is forwarding data to intermediate-forwarder, and that leg works with SSL enabled, but intermediate-forwarder to search-head is not working.
I've checked metrics.log on intermediate-forwarder, and it seems that data is stuck in the output queue and is not being sent to search-head. Here's a short excerpt from metrics.log:
10-26-2012 03:58:04.629 +0000 INFO Metrics - group=queue, name=tcpout_vpc-splunk-search-head-daniel.nestlabs.com_9998, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855
10-26-2012 03:59:01.043 +0000 INFO Metrics - group=queue, name=tcpout_vpc-splunk-search-head-daniel.nestlabs.com_9998, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855
10-26-2012 03:59:57.197 +0000 INFO Metrics - group=queue, name=tcpout_vpc-splunk-search-head-daniel.nestlabs.com_9998, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855
10-26-2012 04:00:49.860 +0000 INFO Metrics - group=queue, name=tcpout_vpc-splunk-search-head-daniel.nestlabs.com_9998, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855
10-26-2012 04:01:45.684 +0000 INFO Metrics - group=queue, name=tcpout_vpc-splunk-search-head-daniel.nestlabs.com_9998, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855
Digging deeper, I found that some of the queues are blocked:
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=tcpout_vpc-splunk-search-head-daniel.nestlabs.com_9998, max_size=512000, current_size=511855, largest_size=0, smallest_size=511855
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=aeq, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=aq, max_size_kb=10240, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=aggqueue, blocked=true, max_size_kb=1024, current_size_kb=1023, current_size=2596, largest_size=2596, smallest_size=2596
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=auditqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1559, largest_size=1559, smallest_size=1559
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=fschangemanager_queue, max_size_kb=5120, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1354, largest_size=1354, smallest_size=1354
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=nullqueue, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=0, smallest_size=0
10-26-2012 04:10:00.304 +0000 INFO Metrics - group=queue, name=parsingqueue, max_size_kb=6144, current_size_kb=6142, current_size=5597, largest_size=5597, smallest_size=5597
Any thoughts on why data is stuck in the output queue?
Thanks,
Daniel
To investigate, turn on debug logging for the SSL modules on both ends and search for errors in splunkd.log.
See http://docs.splunk.com/Documentation/Splunk/4.3.4/Troubleshooting/Enabledebuglogging#In_Splunk_Web
The last queue in the pipeline is the indexqueue, and it is blocked.
So something must be wrong in one of two places:
in the outputs configuration of the intermediate heavy forwarder (check for extra destinations, forwarding to a third party, or cloning)
or on the indexer: how do the queues look there? Try restarting it. Is it able to write to disk and index its own local logs?
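To compare the queues on both hosts, one option is to scan metrics.log for group=queue lines and list the queues that are blocked or nearly full. Here is a minimal Python sketch, assuming the key=value layout shown in the excerpt above. The function names and the 0.9 threshold are my own choices for illustration, not anything Splunk ships:

```python
import re

# Matches key=value pairs such as "name=aggqueue" or "blocked=true".
PAIR = re.compile(r"(\w+)=([^,\s]+)")


def parse_queue_line(line):
    """Return the fields of a group=queue metrics line, or None for other lines."""
    fields = dict(PAIR.findall(line))
    if fields.get("group") != "queue":
        return None
    return fields


def fill_ratio(fields):
    """Fraction of the queue in use, from whichever size fields the line carries."""
    if "max_size_kb" in fields:
        used = int(fields.get("current_size_kb", 0))
        cap = int(fields["max_size_kb"])
    else:
        used = int(fields.get("current_size", 0))
        cap = int(fields.get("max_size", 0))
    return used / cap if cap else 0.0


def report(lines, threshold=0.9):
    """Map queue name -> (fill ratio, blocked flag) for blocked or nearly full queues.

    Later lines override earlier ones, so the result reflects the latest sample.
    """
    latest = {}
    for line in lines:
        fields = parse_queue_line(line)
        if fields is None or "name" not in fields:
            continue
        ratio = fill_ratio(fields)
        blocked = fields.get("blocked") == "true"
        if blocked or ratio >= threshold:
            latest[fields["name"]] = (ratio, blocked)
        else:
            latest.pop(fields["name"], None)
    return latest
```

Feed it the lines of $SPLUNK_HOME/var/log/splunk/metrics.log on the forwarder and on the indexer. If the indexer's own queues show up as full or blocked, the problem is on the receiving side and not in the forwarder's output.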
Here is my outputs.conf on the intermediate forwarder:
[tcpout]
defaultGroup = vpc-splunk-search-head-daniel.nestlabs.com_9998
indexAndForward = 0
[tcpout:vpc-splunk-search-head-daniel.nestlabs.com_9998]
autoLB = true
server = vpc-splunk-search-head-daniel.nestlabs.com:9998
maxQueueSize = 5MB
[tcpout-server://vpc-splunk-search-head-daniel.nestlabs.com:9998]
sslVerifyServerCert = true
sslCertPath = $SPLUNK_HOME/etc/certs/searchhead/splunk.pem
sslCommonNameToCheck = splunk.nestlabs.com
sslPassword = ***************
sslRootCAPath = $SPLUNK_HOME/etc/certs/searchhead/cacert.pem
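To double-check the suggestion about extra destinations or cloning, the tcpout stanzas can be listed programmatically. This is a rough Python sketch that treats outputs.conf as a plain INI file, which is an assumption: it ignores Splunk's layering of default and local files, so `splunk btool` remains the authoritative view. The function name `tcpout_destinations` is hypothetical:

```python
import configparser


def tcpout_destinations(conf_text):
    """Return (default groups, {group name: [servers]}) from outputs.conf text."""
    parser = configparser.ConfigParser(
        interpolation=None,   # leave values such as $SPLUNK_HOME untouched
        delimiters=("=",),    # server values contain ':' so only split on '='
        strict=False,         # tolerate repeated stanzas
    )
    parser.optionxform = str  # keep the camelCase setting names as written
    parser.read_string(conf_text)

    raw_default = parser.get("tcpout", "defaultGroup", fallback="")
    default_groups = [g.strip() for g in raw_default.split(",") if g.strip()]

    groups = {}
    prefix = "tcpout:"
    for section in parser.sections():
        if section.startswith(prefix):
            raw_servers = parser.get(section, "server", fallback="")
            groups[section[len(prefix):]] = [
                s.strip() for s in raw_servers.split(",") if s.strip()
            ]
    return default_groups, groups
```

With the configuration above this should report a single default group with a single server. More than one group in defaultGroup would mean the data is being cloned to several destinations.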