Getting Data In

When is the BatchReader used and when is the TailingProcessor used?


I have a Universal Forwarder monitoring a folder that receives an hourly data export. Sometimes a file has a lot of records (50,000+) and sometimes only a few thousand. I have noticed that when the file is bigger, the BatchReader appears to process it, and when the file is smaller, the TailingProcessor processes it.

Is there a size limit that controls this, and can it be configured?


Splunk Employee

The batch reader is used when the file is over 20 MB in size. Otherwise, the regular tailing processor queue is used. The batch reader only processes one file at a time, while the tailing processor can handle many. The limit exists to prevent a bunch of large files from using up all the tailing processor's slots and starving out new, smaller files.

The threshold can be changed in limits.conf, under the [inputproc] stanza:

min_batch_size_bytes = 10485760

to set it to 10 MB, for example.
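
To sanity-check which reader a given file should land in, the rule above can be sketched in a few lines of Python. This is only an illustration of the size comparison, not Splunk code: the function names are made up for this example, and it assumes "over" means strictly greater than the threshold, which the answer above does not spell out.

```python
# Hypothetical helper names; this only mirrors the size rule described above.
DEFAULT_MIN_BATCH_SIZE_BYTES = 20 * 1024 * 1024  # the 20 MB default mentioned above


def mb_to_bytes(megabytes: int) -> int:
    """Convert megabytes to the byte value expected by min_batch_size_bytes."""
    return megabytes * 1024 * 1024


def expected_reader(file_size_bytes: int,
                    min_batch_size_bytes: int = DEFAULT_MIN_BATCH_SIZE_BYTES) -> str:
    """Return the reader expected to pick up a file of the given size.

    Assumption: a file exactly at the threshold stays with the tailing processor.
    """
    if file_size_bytes > min_batch_size_bytes:
        return "BatchReader"
    return "TailingProcessor"


if __name__ == "__main__":
    fifteen_mb = mb_to_bytes(15)
    print(mb_to_bytes(10))                               # value to put in limits.conf for 10 MB
    print(expected_reader(fifteen_mb))                   # with the default threshold
    print(expected_reader(fifteen_mb, mb_to_bytes(10)))  # with the threshold lowered to 10 MB
```

With the default threshold a 15 MB file stays with the tailing processor; lowering the threshold to 10 MB as shown above would move it to the batch reader.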