Currently my Splunk server mounts a CIFS partition on a Windows box to read the log files there. I now need to install Light Forwarder on the Windows box and stop using the CIFS mount. I know that when I install and run the Light Forwarder, it will go back and read all the log files in the directory I need to monitor.
The issue I have is that almost all of the log files have already been read and indexed while the CIFS mount was in place. Running Light Forwarder will re-read over 30 GB of logs into my Splunk instance. The owner of the Windows box cannot delete the old logs or move them to another directory without causing issues for the application running on the Windows box, which creates and uses the log files. Eventually the app will delete files older than 30 days.
The files are named using the YYYYMMDD-XX.log scheme and I could wildcard them, but that would be ugly and would require hands-on monitoring of the files and modification of inputs.conf as the older files age out. Is there a way to tell Light Forwarder to ignore files based on the date of the file (i.e., do not read files with a datestamp older than a day ago, etc.)?
If the file names contain the month, day, year, and timestamp, you should be able to configure a whitelist or blacklist for them. For example, if you want Splunk to ignore files that contain a specific string you could do something like this:
[monitor:///mnt/logs]
blacklist = 2009022file\.txt$
So something like this should work - to get rid of all the August logs and the September logs before today (9/28).
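As a rough sketch of what such a blacklist could look like for the YYYYMMDD-XX.log naming scheme: the monitored path below is hypothetical, the year 2009 is only an assumption taken from the example string above, and the XX part is assumed to consist of word characters. The regex is meant to match every August file plus September 1 through 27, leaving 9/28 and later to be read.

```ini
# Hypothetical path on the Windows box; replace with the real log directory.
[monitor://D:\AppLogs]
# Assumed year: 2009. Skips 200908DD-XX.log (all of August) and
# 20090901-XX.log through 20090927-XX.log (September before the 28th).
blacklist = 2009(08\d\d|09([01]\d|2[0-7]))-\w+\.log$
```

Once the application has aged out the old files (after 30 days, per the question), the blacklist line should no longer match anything and could be removed.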
You could set the followTail=1 parameter, which makes Splunk start reading at the end of each file the first time it sees it. You can probably use oneshot and a little manual work to get any files that are caught in the overlap (or underlap?).
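For illustration, a minimal monitor stanza with followTail might look like the following. The path is hypothetical and the rest of the stanza is assumed, not taken from the original setup.

```ini
# Hypothetical path on the Windows box; replace with the real log directory.
[monitor://D:\AppLogs]
# Start at the end of each file the first time the forwarder sees it,
# so content that is already in the files is not indexed again.
followTail = 1
```

Because the setting applies to any file the first time it is seen, it is probably best treated as a one-time measure and removed once the existing files have been picked up, so that newly created files are read from the beginning.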
There is an option for ignoring older files. In inputs.conf, place one of the lines below inside the monitor stanza:
ignoreOlderThan=1s
ignoreOlderThan=1m
ignoreOlderThan=1h
ignoreOlderThan=1d
1s --> ignores any files older than one second
1m --> ignores any files older than one minute
1h --> ignores any files older than one hour
1d --> ignores any files older than one day
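Put together for the case in the question, a stanza along these lines might work. The path is hypothetical, and the whitelist line is an optional extra that is not part of the answer above. Note that the age check is based on the file's modification time, not on the date in the file name.

```ini
# Hypothetical path on the Windows box; replace with the real log directory.
[monitor://D:\AppLogs]
# Optional assumption: only pick up files ending in .log.
whitelist = \.log$
# Skip files whose modification time is more than one day in the past.
ignoreOlderThan = 1d
```

With this in place, the file the application is currently writing to keeps a recent modification time and continues to be read, while the older files already indexed over the CIFS mount should be skipped.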