Scenario: 1x load balancer, 2x light forwarders, 1x indexer.

The goal is to make it possible to reboot a single Splunk instance and not lose events. Here is the output.conf on the forwarders:
[tcpout:x.x.x.x_x]
autoLB = false
server = x.x.x.x_x
useACK = true
maxQueueSize = 400MB
sendCookedData = true
To test this scenario I produce a controlled number of events and then verify they are indexed. Restarting either forwarder while sending events does not incur loss. Though, if I restart the indexer I always lose a small number of events.
I have verified that the events are received by the forwarders.
It is my understanding that using maxQueueSize and autoLB together should eliminate lost events. (http://docs.splunk.com/Documentation/Splunk/4.3.3/Deploy/Protectagainstlossofin-flightdata)
Here is what the splunk.log looks like:
06-28-2012 10:17:12.818 -0900 WARN  TcpOutputFd - Connect to x.x.x.x:x failed. Connection refused
06-28-2012 10:17:12.818 -0900 ERROR TcpOutputFd - Connection to host=x.x.x.x:x failed
06-28-2012 10:17:12.818 -0900 WARN  TcpOutputFd - Connect to x.x.x.x:x failed. Connection refused
06-28-2012 10:17:12.818 -0900 ERROR TcpOutputFd - Connection to host=x.x.x.x:x failed
06-28-2012 10:17:12.818 -0900 WARN  TcpOutputProc - Applying quarantine to idx=x.x.x.x:x numberOfFailures=4
06-28-2012 10:17:12.818 -0900 WARN  TcpOutputProc - Applying quarantine to idx=x.x.x.x:x numberOfFailures=5
06-28-2012 10:18:42.847 -0900 INFO  TcpOutputProc - Removing quarantine from idx=x.x.x.x:x
06-28-2012 10:18:42.850 -0900 INFO  TcpOutputProc - Connected to idx=x.x.x.x:x using ACK.
In this particular test I sent 150 events, 144 were indexed
What am I missing?
EDIT:
The indexer log is shown below. The log begins as I issue the restart command. In this particular test 67 of 80 events were indexed.
06-28-2012 17:11:10.296 -0400 INFO  IndexProcessor - handleSignal 2: Disabling streaming searches.
06-28-2012 17:11:10.296 -0400 INFO  IndexProcessor - request state change RUN to SHUTDOWN_SIGNALED
06-28-2012 17:11:10.296 -0400 WARN  HTTPServer - Disabling all acceptors due to shutdown
06-28-2012 17:11:10.951 -0400 INFO  IndexProcessor - active realtime streams have hit 0 during shutdown
06-28-2012 17:11:11.288 -0400 WARN  HTTPServer - Shutdown triggered via callback.
06-28-2012 17:11:11.288 -0400 WARN  HTTPServer - HTTPServer shutting down. Terminating eloop.
06-28-2012 17:11:11.288 -0400 WARN  HTTPServer - Closing all acceptors due to shutdown
06-28-2012 17:11:11.288 -0400 INFO  loader - Shutdown HTTPDispatchThread
06-28-2012 17:11:11.289 -0400 INFO  ShutdownHandler - Shutting down splunkd
06-28-2012 17:11:11.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Thruput"
06-28-2012 17:11:11.289 -0400 INFO  ShutdownHandler - Component 'ThruputProcessor' completed shutdown immediately
06-28-2012 17:11:11.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_TcpInput"
06-28-2012 17:11:11.289 -0400 INFO  TcpInputProc - Stopping IPv4 port x
06-28-2012 17:11:11.289 -0400 INFO  TcpInputProc - Stopping IPv4 port x
06-28-2012 17:11:11.289 -0400 INFO  TcpInputProc - Stopping IPv4 port x
06-28-2012 17:11:11.289 -0400 INFO  TcpInputProc - Stopping IPv4 port x
06-28-2012 17:11:11.289 -0400 INFO  TcpInputProc - Stopping IPv4 port x
06-28-2012 17:11:11.289 -0400 INFO  TcpInputProc - Stopping IPv4 port x
06-28-2012 17:11:12.084 -0400 INFO  TcpInputProc - Waiting for connection from src=x to close before shutting down TcpInputProcessor.
06-28-2012 17:11:12.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4587; will try again
06-28-2012 17:11:14.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4588; will try again
06-28-2012 17:11:17.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4589; will try again
06-28-2012 17:11:21.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4590; will try again
06-28-2012 17:11:22.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4591; will try again
06-28-2012 17:11:24.283 -0400 WARN  ProcessTracker - Failed to start process, PGID 4592; will try again
06-28-2012 17:11:26.174 -0400 ERROR DispatchCommand - Failed to start the search process.
06-28-2012 17:11:28.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4593; will try again
06-28-2012 17:11:36.283 -0400 WARN  ProcessTracker - Failed to start process, PGID 4594; will try again
06-28-2012 17:11:40.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4595; will try again
06-28-2012 17:11:41.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4596; will try again
06-28-2012 17:11:43.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4597; will try again
06-28-2012 17:11:45.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4598; will try again
06-28-2012 17:11:45.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4599; will try again
06-28-2012 17:11:52.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4600; will try again
06-28-2012 17:11:52.287 -0400 WARN  ProcessTracker - Failed to start process, PGID 4601; will try again
06-28-2012 17:11:54.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4602; will try again
06-28-2012 17:11:57.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4603; will try again
06-28-2012 17:11:57.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4604; will try again
06-28-2012 17:11:57.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4605; will try again
06-28-2012 17:11:57.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4606; will try again
06-28-2012 17:11:57.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4607; will try again
06-28-2012 17:12:00.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4608; will try again
06-28-2012 17:12:02.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4609; will try again
06-28-2012 17:12:02.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4610; will try again
06-28-2012 17:12:04.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4611; will try again
06-28-2012 17:12:06.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4612; will try again
06-28-2012 17:12:06.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4613; will try again
06-28-2012 17:12:06.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4614; will try again
06-28-2012 17:12:06.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4615; will try again
06-28-2012 17:12:06.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4616; will try again
06-28-2012 17:12:06.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4617; will try again
06-28-2012 17:12:06.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4618; will try again
06-28-2012 17:12:06.282 -0400 WARN  ProcessTracker - Failed to start process, PGID 4619; will try again
06-28-2012 17:12:06.282 -0400 WARN  ProcessTracker - Failed to start process, PGID 4620; will try again
06-28-2012 17:12:06.282 -0400 WARN  ProcessTracker - Failed to start process, PGID 4621; will try again
06-28-2012 17:12:06.282 -0400 WARN  ProcessTracker - Failed to start process, PGID 4622; will try again
06-28-2012 17:12:06.303 -0400 WARN  ProcessTracker - Failed to start process, PGID 4623; will try again
06-28-2012 17:12:11.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4624; will try again
06-28-2012 17:12:18.283 -0400 WARN  ProcessTracker - Failed to start process, PGID 4625; will try again
06-28-2012 17:12:18.284 -0400 WARN  ProcessTracker - Failed to start process, PGID 4626; will try again
06-28-2012 17:12:20.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4627; will try again
06-28-2012 17:12:23.279 -0400 WARN  ProcessTracker - Failed to start process, PGID 4628; will try again
06-28-2012 17:12:23.283 -0400 WARN  ProcessTracker - Failed to start process, PGID 4629; will try again
06-28-2012 17:12:23.287 -0400 WARN  ProcessTracker - Failed to start process, PGID 4630; will try again
06-28-2012 17:12:23.291 -0400 WARN  ProcessTracker - Failed to start process, PGID 4631; will try again
06-28-2012 17:12:23.294 -0400 WARN  ProcessTracker - Failed to start process, PGID 4632; will try again
06-28-2012 17:12:25.278 -0400 WARN  timeinvertedIndex - splunk-optimize is already running for this index /opt/splunk/var/lib/splunk/_internaldb/db/hot_v1_677.
06-28-2012 17:12:25.278 -0400 WARN  timeinvertedIndex - splunk-optimize is already running for this index /opt/splunk/var/lib/splunk/syslog/db/hot_v1_358.
06-28-2012 17:12:26.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4633; will try again
06-28-2012 17:12:26.443 -0400 ERROR DispatchCommand - Failed to start the search process.
06-28-2012 17:12:26.454 -0400 ERROR DispatchCommand - Failed to start the search process.
06-28-2012 17:12:27.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4634; will try again
06-28-2012 17:12:28.278 -0400 WARN  timeinvertedIndex - splunk-optimize is already running for this index /opt/splunk/var/lib/splunk/perfmon/db/hot_v1_12.
06-28-2012 17:12:32.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4635; will try again
06-28-2012 17:12:32.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4636; will try again
06-28-2012 17:12:32.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4637; will try again
06-28-2012 17:12:32.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4638; will try again
06-28-2012 17:12:32.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4639; will try again
06-28-2012 17:12:32.280 -0400 WARN  ProcessTracker - Failed to start process, PGID 4640; will try again
06-28-2012 17:12:32.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4641; will try again
06-28-2012 17:12:32.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4642; will try again
06-28-2012 17:12:32.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4643; will try again
06-28-2012 17:12:32.281 -0400 WARN  ProcessTracker - Failed to start process, PGID 4644; will try again
06-28-2012 17:12:32.281 -0400 WARN  timeinvertedIndex - splunk-optimize is already running for this index /opt/splunk/var/lib/splunk/nagios/db/hot_v1_27.
06-28-2012 17:12:37.278 -0400 WARN  ProcessTracker - Failed to start process, PGID 4645; will try again
06-28-2012 17:12:41.288 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_TcpOutput"
06-28-2012 17:12:41.288 -0400 INFO  ShutdownHandler - Component 'TcpOutputProcessor' completed shutdown immediately
06-28-2012 17:12:41.288 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_UdpInput"
06-28-2012 17:12:41.288 -0400 INFO  ShutdownHandler - Component 'UDPInputProcessor' completed shutdown immediately
06-28-2012 17:12:41.288 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_FifoInput"
06-28-2012 17:12:41.288 -0400 INFO  UDPInputProcessor - Unregistering metrics callback for: udpin_connections
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_WinEventLogInput"
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Scheduler"
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - Component 'SavedSearchScheduler' completed shutdown immediately
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - Component 'DispatchReaper' completed shutdown immediately
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Tailing"
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_SyslogOutput"
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - Component 'SyslogOutputProcessor' completed shutdown immediately
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_HTTPOutput"
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - Component 'HTTPOutputProcessor' completed shutdown immediately
06-28-2012 17:12:41.289 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_TailingXP"
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor - Received shutdown signal.
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor - Will reconfigure input.
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor -   Calling addFromAnywhere in TailWatcher=0x7fb26886ab40.
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor -   Shutting down with TailingShutdownActor=0x878fef0 and TailWatcher=0x7fb26886ab40.
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor -   Pausing BatchReader module...
06-28-2012 17:12:41.289 -0400 INFO  BatchReader - State transitioning from 0 to 1 (pseudoPause).
06-28-2012 17:12:41.289 -0400 INFO  BatchReader - State transitioning from 1 to 2 (pseudoPause).
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor -   Removing TailWatcher from eventloop...
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor -     ...removed.
06-28-2012 17:12:41.289 -0400 INFO  TailingProcessor -   Eventloop terminated successfully.
06-28-2012 17:12:41.292 -0400 INFO  TailingProcessor - Signalling shutdown complete.
06-28-2012 17:12:41.292 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_BatchReader"
06-28-2012 17:12:41.292 -0400 INFO  BatchReader - State transitioning from 2 to 3 (signalShutdown).
06-28-2012 17:12:41.292 -0400 INFO  BatchReader - Terminating threadpool.
06-28-2012 17:12:41.293 -0400 INFO  ShutdownHandler - Component 'BatchReader' completed shutdown immediately
06-28-2012 17:12:41.293 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Archive"
06-28-2012 17:12:41.943 -0400 WARN  pipeline - Exiting pipeline udp gracefully: got eExit from processor udp
06-28-2012 17:12:42.067 -0400 WARN  pipeline - Exiting pipeline fifo gracefully: got eExit from processor fifo
06-28-2012 17:12:42.085 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_AuditTrailManager"
06-28-2012 17:12:42.085 -0400 INFO  ShutdownHandler - Component 'AuditTrailManager' completed shutdown immediately
06-28-2012 17:12:42.085 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_AuditTrailQueueServiceThread"
06-28-2012 17:12:42.085 -0400 INFO  ShutdownHandler - Component 'QueueServiceThread' completed shutdown immediately
06-28-2012 17:12:42.085 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_FSChangeMonitor"
06-28-2012 17:12:42.085 -0400 WARN  pipeline - Exiting pipeline archivePipe gracefully: got eExit from processor archiveProcessor
06-28-2012 17:12:42.106 -0400 INFO  TcpInputProc - Closing splunk 2 splunk IPv4 port x
06-28-2012 17:12:42.106 -0400 INFO  TcpInputProc - Closing splunk 2 splunk IPv4 port x
06-28-2012 17:12:42.106 -0400 INFO  TcpInputProc - Closing splunk 2 splunk IPv4 port x
06-28-2012 17:12:42.106 -0400 INFO  TcpInputProc - Closing splunk 2 splunk IPv4 port x
06-28-2012 17:12:42.106 -0400 INFO  TcpInputProc - Closing splunk 2 splunk IPv4 port x
06-28-2012 17:12:42.106 -0400 INFO  TcpInputProc - Closing splunk 2 splunk IPv4 port x
06-28-2012 17:12:42.106 -0400 ERROR TcpInputProc - Error encountered for connection from src=10.1.120.48:38420. Local side shutting down
06-28-2012 17:12:42.713 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_FSChangeManagerProcessor"
06-28-2012 17:12:42.714 -0400 INFO  ShutdownHandler - Component 'FSChangeManagerProcessor' completed shutdown immediately
06-28-2012 17:12:42.714 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Select"
06-28-2012 17:12:42.714 -0400 INFO  ShutdownHandler - Component 'PipeFlusher' completed shutdown immediately
06-28-2012 17:12:42.714 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Database1"
06-28-2012 17:12:42.714 -0400 INFO  IndexProcessor - request state change SHUTDOWN_SIGNALED to SHUTDOWN_IN_PROGRESS
06-28-2012 17:12:42.714 -0400 INFO  PipeFlusher - Flushing pipelines...
06-28-2012 17:12:42.714 -0400 INFO  PipeFlusher - Finished triggering pipeline flush.
06-28-2012 17:12:42.714 -0400 INFO  IndexProcessor - shutting down: start
06-28-2012 17:12:42.714 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: _audit
06-28-2012 17:12:42.716 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: _internal
06-28-2012 17:12:42.721 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: _thefishbucket
06-28-2012 17:12:42.721 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: blackberry
06-28-2012 17:12:42.722 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: email
06-28-2012 17:12:42.722 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: history
06-28-2012 17:12:42.722 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: main
06-28-2012 17:12:42.729 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: modsecurity
06-28-2012 17:12:42.729 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: msexchange
06-28-2012 17:12:42.778 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: nagios
06-28-2012 17:12:42.781 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: os
06-28-2012 17:12:42.782 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: perfmon
06-28-2012 17:12:42.785 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: rss
06-28-2012 17:12:42.786 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary
06-28-2012 17:12:42.786 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary_forwarders
06-28-2012 17:12:42.788 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary_hosts
06-28-2012 17:12:42.789 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary_indexers
06-28-2012 17:12:42.790 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary_pools
06-28-2012 17:12:42.790 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary_sources
06-28-2012 17:12:42.793 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: summary_sourcetypes
06-28-2012 17:12:42.796 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: syslog
06-28-2012 17:12:42.799 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: test
06-28-2012 17:12:42.799 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: windows
06-28-2012 17:12:42.799 -0400 INFO  databasePartitionPolicy - Handling shutdown or signal in DatabasePartitionPolicy: _blocksignature
06-28-2012 17:12:42.805 -0400 INFO  BucketMover - 0 jobs canceled from BucketMoverExecutor
06-28-2012 17:12:42.805 -0400 INFO  IndexProcessor - request state change SHUTDOWN_IN_PROGRESS to SHUTDOWN_COMPLETE
06-28-2012 17:12:42.805 -0400 INFO  IndexProcessor - shutting down: end
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_LoadLDAPUsers"
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'LoadLDAPUsersThread' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_MetricsManager"
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'MetricsManager' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Pipeline"
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:dev-null' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:indexerPipe' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:exec' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:fschangemanager' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:tcp' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:merging' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:typing' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:stashparsing' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'Pipeline:parsing' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Queue"
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'parsingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'parsingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'parsingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'parsingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'parsingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'indexQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'parsingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'aggQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'typingQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - Component 'indexQueue' completed shutdown immediately
06-28-2012 17:12:42.805 -0400 INFO  ShutdownHandler - shutting down level "ShutdownLevel_Exec"
06-28-2012 17:12:42.805 -0400 WARN  pipeline - Exiting pipeline indexerPipe gracefully: got eExit from processor indexer
06-28-2012 17:12:42.807 -0400 INFO  ShutdownHandler - Shutdown complete in 91.52 seconds
06-28-2012 17:12:42.831 -0400 WARN  pipeline - Exiting pipeline exec gracefully: got eExit from processor exec
06-28-2012 17:12:43.075 -0400 INFO  TcpInputProc - Finishing shutdown. Timed out in 90 seconds
06-28-2012 17:12:43.075 -0400 WARN  pipeline - Exiting pipeline tcp gracefully: got eExit from processor tcp
06-28-2012 17:12:43.526 -0400 WARN  pipeline - Exiting pipeline stashparsing forcefully
06-28-2012 17:12:43.581 -0400 WARN  pipeline - Exiting pipeline dev-null forcefully
06-28-2012 17:12:43.713 -0400 WARN  pipeline - Exiting pipeline fschangemanager gracefully: got eExit from processor fschangemanager
06-28-2012 17:12:43.714 -0400 WARN  pipeline - Exiting pipeline parsing forcefully
06-28-2012 17:12:43.714 -0400 WARN  pipeline - Exiting pipeline merging forcefully
06-28-2012 17:12:43.714 -0400 WARN  pipeline - Exiting pipeline typing forcefully
06-28-2012 17:12:43.714 -0400 WARN  loader - All pipelines finished.
EDIT # 2
Added the following to the inputs.conf:
queueSize=50KB
persistentQueueSize=400MB
Doing this resulted in duplication: 100 events triggered, 169 indexed
Screenshot:

 
		
		
		
		
		
	
			
		
		
			
					
		I believe that if you just use "useACK =true" in your forwarder's outputs.conf (no need to touch persistentQueueSize in your forwarder's inputs.conf), you would not suffer from any events lost even when restarting your indexers.
When using that "useACK=true", your indexer get some duplicates events due to the momentary discontinuity between your forwarder and indexer, but this is far better than losing some events. And to deal with those duplicates, you could just use dedup to get the unique events out of it. This should just solve it.
useACK=true guarantees full delivery, try that to see how it goes
I got an answer from Splunk support. It turns out that duplication is expected behavior.
Chad,
Unfortunately, there is no solution to
the duplicated events. With the
"useACK" functionality, there's times
where the forwarder doesn't get the
ack message after the indexer has
written the event to the log. This is
expected behavior.Brian
Here is the results of Splunk support's test:
24460 Messages Sent 24444 Messages received by the forwarder (tcpdump) 25276 Messages received by the indexer 832 Duplicated messages
 
		
		
		
		
		
	
			
		
		
			
					
		I believe that if you just use "useACK =true" in your forwarder's outputs.conf (no need to touch persistentQueueSize in your forwarder's inputs.conf), you would not suffer from any events lost even when restarting your indexers.
When using that "useACK=true", your indexer get some duplicates events due to the momentary discontinuity between your forwarder and indexer, but this is far better than losing some events. And to deal with those duplicates, you could just use dedup to get the unique events out of it. This should just solve it.
useACK=true guarantees full delivery, try that to see how it goes
maybe use "persistentqueue"?
http://docs.splunk.com/Documentation/Splunk/latest/Data/Usepersistentqueues
I get a weird duplication issue when I try that. Triggered 100 events, see 169 on the indexer. (see screenshot above)
@Ayn - No, there are no TCP errors during normal operation. I have not seen events lost during normal operation - only during restarts.
Umm, the portions of splunkd.log I was referring to was rather what happens BEFORE you issue a restart command - that is, in what is supposed to be normal operations, do you see errors on the indexer side with regards to TCP connections?
Do you see corresponding events on the indexer that might give you clues about why it is refusing connections?
