We see the following message in the forwarder's logs:
10-18-2017 11:15:29.630 -0500 WARN TailReader - Enqueuing a very large file=<hadoop large file> in the batch reader, with bytes_to_read=4981188783, reading of other large files could be delayed
Where does the forwarder enqueue these files, and is there a way to dequeue them?
The answer to "When is the BatchReader used and when is the TailingProcessor used?" says:
-- The batch reader is used when the file is over 20 MB in size. Otherwise, the regular tailing processor queue is used. The batch reader only processes one file at a time, while the tailing processor can handle many. The limit exists to prevent a group of large files from using up all the slots and starving out new, smaller files.
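For reference, the 20 MB threshold mentioned above is configurable on the forwarder. The sketch below assumes it corresponds to the min_batch_size_bytes setting under [inputproc] in limits.conf; treat the setting name and default as assumptions and check the limits.conf spec for your version before relying on it:

    # limits.conf on the forwarder (assumed setting; verify against limits.conf.spec)
    [inputproc]
    # Files larger than this many bytes are handed to the batch reader
    # instead of the tailing processor. 20971520 bytes = 20 MB.
    min_batch_size_bytes = 20971520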

I don't think this is a problem, because the TailReader and BatchReader queues are processed separately.
https://wiki.splunk.com/Community:HowIndexingWorks
Is the issue that you don't want to ingest these large files at all? Or are there particular large files that you would like to prioritize for ingestion?

You see, this thread relates to "Why are the queues being filled up on one indexer?"
In this one, I am looking at it from the forwarder's side. It seems to me that when the BatchReader processes huge amounts of data, it locks onto one indexer. The BatchReader backlog also seems to be irreversible: I removed the flume app and, 6 hours on, the enqueued files were still flowing into Splunk (to the same indexer). Only by uninstalling the forwarder did the issue clear.

Is it a universal forwarder reading the files?
If you are using Splunk 6.6 or newer, you might be able to use EVENT_BREAKER_ENABLE and EVENT_BREAKER in your props.conf to advise the forwarder where the end of each event is. This allows it to switch output locations without having to reach the end of the file...
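As a rough illustration of that suggestion, here is a minimal props.conf sketch for the forwarder. The sourcetype name hadoop:large and the line-break regex are assumptions; adjust them to match your data:

    # props.conf on the universal forwarder (sourcetype name is hypothetical)
    [hadoop:large]
    # Let the forwarder recognize event boundaries so it can switch
    # indexers mid-file instead of waiting for end of file.
    EVENT_BREAKER_ENABLE = true
    # The capture group marks the boundary; events are assumed to be newline-delimited.
    EVENT_BREAKER = ([\r\n]+)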
It is the universal forwarder reading the files. I think the issue is TailReader versus BatchReader: what I see is that TailReader is near real-time while BatchReader is not, and we don't seem to have any control over the pending batches.
Then there is the association of a batch with an indexer. In the past week we had one indexer that ended up receiving 3/4 TB a day of data, all streamed from this single batch on a single forwarder.

A single large file keeps one indexer busy, which may degrade overall performance. The solution is as described by garethatiag.
Increasing the number of ingestion pipelines on the forwarder and the indexers may also help (see the sketch below).
For performance troubleshooting, more detail about the environment and the events is needed. It would be better to describe both accurately and ask again.
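If you do try multiple pipelines, a minimal server.conf sketch would look like the following; the value 2 is an assumption, and each extra pipeline set consumes additional CPU and memory on the host:

    # server.conf on the forwarder (and optionally the indexers)
    [general]
    # Run two independent ingestion pipeline sets so one very large
    # batch-read file does not monopolize the only pipeline and output.
    parallelIngestionPipelines = 2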
