I just solved my own variation of this problem...
Key discovery: empirically, at least in v6.5.0, time_before_close (which keeps a monitored file open for a while after reaching EOF, so a partially written event is not cut off) helps only the tailing processor and not the batch reader. (See also https://answers.splunk.com/answers/109779/when-is-the-batchreader-used-and-when-is-the-tailingprocessor-used.html)
During peak periods, one particular log file of mine grows so quickly that my UF (universal forwarder) would decide to read it using the batch reader. Each hour I saw hundreds of log messages from that file split into two Splunk events each, along with contemporaneous "INFO TailReader - Batch input finished reading file" messages in splunkd.log.
I don't want rsyslog working extra hard to write individual syslog messages to the filesystem atomically, so my solution was to stop using the batch reader, with these settings in limits.conf on the forwarder:
[default]
# Set very high so we never use the batch reader.
min_batch_size_bytes = 1000000000000
[inputproc]
# Set higher than the number of simultaneously active log files to
# avoid splitting events.
max_fd = 10000
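To check whether the batch reader is still being used after a change like this, one option is to count the "Batch input finished reading file" messages in splunkd.log per hour. The following is only a rough sketch, not anything Splunk ships: the function name is made up for illustration, and it assumes each splunkd.log line begins with a timestamp whose first 13 characters are the date and hour (for example "06-01-2017 14"). Adjust the slice if your log lines look different.

```python
import sys
from collections import Counter

# Substring quoted from splunkd.log in the post above.
MARKER = "INFO TailReader - Batch input finished reading file"


def count_batch_reads_per_hour(lines):
    """Count batch-reader completion messages, grouped by date and hour.

    Assumption: every line starts with a timestamp such as
    "06-01-2017 14:03:22.123 -0700", so line[:13] is "06-01-2017 14".
    """
    counts = Counter()
    for line in lines:
        if MARKER in line:
            counts[line[:13]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_batch_reads.py /path/to/splunkd.log
    with open(sys.argv[1], errors="replace") as log_file:
        for hour, n in sorted(count_batch_reads_per_hour(log_file).items()):
            print(hour, n)
```

If the settings took effect, the per-hour counts for the busy file should drop to zero after the forwarder restarts.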