Hi, we have an indexer cluster to which we index many, many small files — a few hundred thousand in total.
We run a universal forwarder on a strong machine (130 GB RAM, 24 CPUs) with a batch input on a local directory.
Our problem is as follows:
The data is indexed very slowly, and the batch input is also behaving strangely. It used to log every indexed file ("Batch input finished reading file..."), but now it logs only a few, then stops, then continues to forward data but doesn't delete the files.
The only way we can see these logs at all is by turning on DEBUG-level logging.
I have checked the logs and I don't have any blocked queues.
We would really appreciate it if anyone has a reasonable explanation for the problem I'm having, or can suggest another way of indexing this immense number of files.
A few hundred thousand files can be too many for a single Universal Forwarder instance. Can you check the CPU percentage of the splunkd process on that box?
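If the process turns out to be CPU-bound on a single ingestion pipeline, one option worth testing (assuming a reasonably recent UF version, and noting that each extra pipeline consumes additional CPU and memory) is to enable parallel ingestion pipelines in server.conf on the forwarder:

```ini
# server.conf on the forwarder -- a sketch, not a drop-in fix;
# each additional pipeline uses extra CPU and memory
[general]
parallelIngestionPipelines = 2
```

Restart the forwarder after the change and watch whether throughput and CPU usage actually improve before raising the value further.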
If your UF is running on Windows, the files may be locked by other processes. You can run the Process Monitor (procmon) tool to analyse this.
Also, check the file system permissions to verify that the UF has the rights to delete the files.
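For reference, a minimal batch input stanza looks like the one below (the directory path is a placeholder — substitute your own). The sinkhole move policy is what tells Splunk to delete each file after it has been read, so if it's missing or the permissions don't allow deletion, files will pile up:

```ini
# inputs.conf -- hypothetical path; move_policy = sinkhole
# deletes each file once it has been indexed
[batch:///data/incoming]
move_policy = sinkhole
disabled = false
```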