I am getting the error below because two files in different folders have the same first two lines, including the timestamps.
ERROR TailReader - File will not be read, seekptr checksum did not match (file=filename.2021-01-19.txt). Last time we saw this initcrc, filename was different. You may wish to use larger initCrcLen for this sourcetype, or a CRC salt on this source. Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.
The monitoring stanza matches filename.*.txt.
So if I increase initCrcLength or set crcSalt, all the files under those folders will get re-indexed.
Along with crcSalt, I tried ignoreOlderThan, but the old files are still getting re-indexed.
Example: with ignoreOlderThan=1d, yesterday's files are still getting re-indexed.
Is there a better way to prevent this?
I made a few changes to solve the issue.
A few lines of data did get re-indexed, but it was only around 10 to 20 lines, which was acceptable.
Don't increase initCrcLength. If the files are in different folders, you can set crcSalt=<SOURCE>, which adds the full path of the source file to the CRC. This ensures that each monitored file has a unique CRC.
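As a rough sketch of that suggestion, an inputs.conf stanza could look like the one below. The monitor path is only an assumption pieced together from the folder names and the filename pattern mentioned in this thread, so adjust it to your real stanza.

```ini
# inputs.conf on the forwarder (path is an assumed example, not the poster's actual stanza)
[monitor:///app/folderA/locationA/filename*.txt]
# <SOURCE> is a literal value: the full source path is mixed into the CRC,
# so files with identical first lines in different folders no longer collide.
crcSalt = <SOURCE>
```

Keep in mind that adding a salt changes the CRC of every file the stanza matches, so files that were already indexed can be read again.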
Thanks @manjunathmeti for answering the question.
However, if we set crcSalt=<SOURCE>, the older files get re-indexed as well, because they are in the same folder.
Example:
/app/folderA/locationA/filename_yyyy-mm-dd.txt
/app/folderB/locationB/filename_yyyy-mm-dd.txt
So if we set crcSalt for either file, all the files under that location get re-indexed. Even with ignoreOlderThan=1d, yesterday's file still gets re-indexed.
The forwarder reads a file only if system_current_time - file_modification_time < ignoreOlderThan. Check whether yesterday's files still fall within this window.
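To illustrate the window, here is a minimal sketch that combines both settings. The path is again an assumed example based on the folders named in this thread, and 1d is simply the value from the question.

```ini
# inputs.conf on the forwarder (path is an assumed example)
[monitor:///app/folderB/locationB/filename*.txt]
crcSalt = <SOURCE>
# Files whose modification time is more than 1 day old are skipped.
# A file last written yesterday evening is still less than 24 hours old
# this morning, so it is inside the window and will be read.
ignoreOlderThan = 1d
```

If yesterday's files must be skipped, a shorter window such as 12h may work, assuming the current day's file is written to more often than that.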