I think the sourcetype was the key.
Once the backlog has cleared, I would suggest backing max_fd off to a value in the hundreds,
as too high a value could well affect performance.
As you know, max_fd defines the maximum number of file descriptors that Splunk will keep open in order to capture any trailing data from files that are written to very slowly.
You can check whether you are hitting the max_fd limit, as there will be errors like the following in splunkd.log:
TailingProcessor - File descriptor cache is full (100), trimming..
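If it helps, here is a rough sketch of how you could count those messages instead of eyeballing the log. It only assumes the message text shown above; the function name and the idea of grouping by the number in parentheses are my own for illustration, and you would need to point it at wherever your splunkd.log actually lives.

```python
import re
from collections import Counter

# Matches the message quoted above and captures the number in parentheses,
# which is the descriptor cache size reported at the time of the message.
FD_FULL = re.compile(
    r"TailingProcessor - File descriptor cache is full \((\d+)\)"
)


def count_fd_cache_full(lines):
    """Return a Counter mapping reported cache size -> number of messages.

    `lines` can be any iterable of strings, for example an open file object.
    (Hypothetical helper, not part of Splunk.)
    """
    hits = Counter()
    for line in lines:
        match = FD_FULL.search(line)
        if match:
            hits[int(match.group(1))] += 1
    return hits


def scan_file(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return count_fd_cache_full(handle)
```

Calling scan_file() with the path to your splunkd.log gives you a count per reported limit. If the count keeps climbing after the backlog has cleared, the limit is probably still too low for the number of files being tailed; if it stays at zero, you likely have room to back max_fd off.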