When monitoring a directory for files (using inputs.conf), is it possible to blacklist or ignore files over a certain size? Say, for instance, a few files get dropped in that are 100 MB or more in size. Splunk usually errors out after processing these anyway. Can I skip processing these larger files? Thanks in advance.
I have used the following hack to solve this problem:
Create a new directory somewhere else (/destination/path/) and point the Splunk forwarder there. Then set up a cron job that creates soft links in the new directory pointing back to the real directory (/source/path/) for any file that meets your "keep" criteria (here, under 100 MB), like this:
*/5 * * * * cd /source/path/ && /bin/find . -maxdepth 1 -type f -size -100M | /bin/sed "s/^..//" | /usr/bin/xargs -I {} /bin/ln -fs /source/path/{} /destination/path/{}
Don't forget to set up a second cron job to delete the broken soft links (ones whose source files have been deleted), or you will end up with tens of thousands of stale links there as well.
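For the cleanup cron, a minimal sketch of the idea (using GNU find's -xtype l, which matches symlinks whose target no longer exists; the temp directories below just stand in for your real /source/path/ and /destination/path/):

```shell
set -eu
SRC=$(mktemp -d)                        # stands in for /source/path/
DEST=$(mktemp -d)                       # stands in for /destination/path/
touch "$SRC/keep.log"                   # source file that still exists
ln -s "$SRC/keep.log" "$DEST/keep.log"  # live link: must survive cleanup
ln -s "$SRC/gone.log" "$DEST/gone.log"  # dangling link: target never existed
# Delete only the broken links in the monitored directory:
find "$DEST" -maxdepth 1 -xtype l -delete
```

In a crontab you would run just the find line against /destination/path/, e.g. every few minutes. Note that -xtype is a GNU extension; on non-GNU systems you would need a different test for dangling links.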
Thank you for adding a workaround, sir. 🙂 I will give it a try, but I will still leave this issue open until Splunk adds a supported solution, such as a file size parameter in inputs.conf. Thanks again.
I don't think there is a native way to ignore files based on their size. On the other hand, Splunk can monitor files much larger than 100 MB, so could you tell us more about "Splunk usually errors after processing these anyway"?