Getting Data In

Where does the forwarder enqueue files?

ddrillic
Ultra Champion

We see the following messages in the forwarder -

10-18-2017 11:15:29.630 -0500 WARN  TailReader - Enqueuing a very large file=<hadoop large file> in the batch reader, with bytes_to_read=4981188783, reading of other large files could be delayed

Where does the forwarder enqueue the files? And is there a way to dequeue them?

The thread When is the BatchReader used and when is the TailingProcessor used? says:

-- The batch reader is used when the file is over 20 MB in size. Otherwise, the regular tailing processor queue is used. The batch reader only processes one file at a time, while the tailing processor can handle many. The limit exists to prevent a bunch of large files from using up all the slots and starving out new smaller files.
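For reference, the 20 MB cutoff quoted above corresponds to the min_batch_size_bytes setting in limits.conf on the forwarder. The sketch below shows the stanza with its documented default; verify the setting against your Splunk version's limits.conf spec before changing it:

```ini
# limits.conf on the forwarder
[inputproc]
# Files larger than this (in bytes) are handed to the batch reader
# instead of the tailing processor. Default is 20 MB (20971520 bytes).
min_batch_size_bytes = 20971520
```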

1 Solution

HiroshiSatoh
Champion

I don't think there is a problem, because the TailReader and BatchReader are processed separately.
https://wiki.splunk.com/Community:HowIndexingWorks

Is the problem that the large files are not being captured?
Or are there large files that you would like to prioritize for ingestion?


ddrillic
Ultra Champion

You see, this thread relates to "Why are the queues being filled up on one indexer?"

In this one, I see things from the forwarder's side. It seems to me that the BatchReader process, with huge amounts of data, locks onto one indexer. The BatchReader process also seems to be irreversible: I removed the flume app, and even after 6 hours the enqueued files kept flowing into Splunk (to the same indexer). Only by uninstalling the forwarder was the issue cleared.


gjanders
SplunkTrust
SplunkTrust

Is it a universal forwarder reading the files?
If you are using Splunk 6.6 or newer, you might be able to use EVENT_BREAKER and EVENT_BREAKER_ENABLE in your props.conf to advise the forwarder where the end of each event is. This allows it to switch output locations without needing to see the end of the file...
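A minimal sketch of that props.conf change on the universal forwarder (the sourcetype name here is a placeholder; substitute your own, and adjust the breaker regex to match your event boundaries):

```ini
# props.conf on the universal forwarder
# ("hadoop:logs" is a hypothetical sourcetype name)
[hadoop:logs]
EVENT_BREAKER_ENABLE = true
# Regex whose first capture group marks the boundary between events.
# This one breaks on newlines, mirroring the default line breaker.
EVENT_BREAKER = ([\r\n]+)
```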


ddrillic
Ultra Champion

It is the universal forwarder reading the files. I think it's TailReader versus BatchReader. What I see is that TailReader is real-time, whereas BatchReader is not, and we also don't seem to have any control over the pending batches...

Then there is the association of the batch with a single indexer. This past week we had one indexer that ended up receiving 3/4 TB of data a day, all streamed from this single batch on a single forwarder.


HiroshiSatoh
Champion

A single large file can occupy one indexer, which may degrade overall performance. The solution is as described by garethatiag.
Increasing the number of forwarder and indexer pipelines may also help.
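The pipeline suggestion above refers to the parallelIngestionPipelines setting in server.conf, which lets a forwarder (or indexer) run more than one ingestion pipeline so that a single large batch cannot monopolize the output. A sketch, assuming you have CPU headroom (check your version's documentation before raising it):

```ini
# server.conf on the forwarder (also applicable on indexers)
[general]
# Run two independent ingestion pipelines so one large file
# cannot block all other inputs. Default is 1.
parallelIngestionPipelines = 2
```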

For performance troubleshooting, more needs to be known about the environment and the events. I think it would be better to describe the environment and events accurately and ask again.
