We have a test case in which we have to monitor one CSV file (1K records) for any change. If we add a row or update anything, any number of times, the file needs to be ingested into the Splunk index again. Please help me find a solution for this test case.
Hello @ajitshukla61116 ,
I know a partial solution for your question: use initCrcLength = 1048576. Splunk will then calculate the CRC over the first 1 MB and reindex the whole CSV file if anything changes in the first 1048576 bytes of the file.
[monitor:///tmp/file.csv]
disabled = false
....
initCrcLength = 1048576
here is an excerpt from the documentation:
initCrcLength = <integer>
* How much of a file, in bytes, that the input reads before trying to identify whether it is a file that has already been seen. You might want to adjust this if you have many files with common headers (comment headers, long CSV headers, etc) and recurring filenames.
* Cannot be less than 256 or more than 1048576.
* CAUTION: Improper use of this setting causes data to be re-indexed. You might want to consult with Splunk Support before adjusting this value - the default is fine for most installations.
* Default: 256 (bytes).
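To make the idea concrete, here is a small Python sketch of a checksum taken over only the first N bytes of a file. It is a simplified illustration, not Splunk's actual implementation: `head_crc` is a hypothetical helper, `zlib.crc32` is just a stand-in checksum, and the tiny 8-byte prefix is only there to keep the sample data short (the real setting cannot go below 256).

```python
import zlib

INIT_CRC_LENGTH = 1048576  # same value as initCrcLength in the stanza above


def head_crc(data: bytes, length: int = INIT_CRC_LENGTH) -> int:
    """Checksum of the first `length` bytes only (hypothetical helper)."""
    return zlib.crc32(data[:length])


original = b"id,name\n1,alice\n2,bob\n"
edited = b"id,name\n1,alice\n2,rob\n"  # one field changed in place

# With a short prefix (here just the 8-byte header) the edit is invisible.
short_prefix_same = head_crc(original, 8) == head_crc(edited, 8)

# With a prefix that covers the edited row, the checksum differs,
# so the file would look like one that has not been seen before.
long_prefix_differs = head_crc(original) != head_crc(edited)

print(short_prefix_same, long_prefix_differs)
```

The point of the sketch is only that the edit must fall inside the checked prefix to be noticed, which is why this is a partial solution for files larger than 1 MB.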
@PavelP thanks for this solution. One problem with this approach is that Splunk re-ingests the complete file in order to pick up the changed row. Is there any way to avoid re-ingesting the complete file?
Looking forward to your comments on this.
One of your requirements is "update any thing". Does that also mean any line anywhere in the file, including at the beginning of the file? If yes, then you have to reindex the whole file.
@PavelP yes, "update any thing" means updating any row or even any field value in the file, not only at the beginning or end of the file.
One more query: is there any way we can delete the previous records before re-ingesting the updated file? Every time the file is updated, the number of records in the Splunk index increases, which does not look like an optimal solution.
Regarding your question "is there any way we can delete previous records before re ingesting updated file": there is no easy way to delete events from Splunk once they are indexed.
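Since indexed events are hard to remove, one thing you could try is to reduce what gets sent in the first place. The Python sketch below is only an illustration of that idea and not a Splunk feature: it compares the previous copy of the CSV with the current one and returns just the rows that are new or changed. The function names and the idea of keeping a previous snapshot are assumptions on my side; whether writing these rows to a separate monitored file fits your setup is something you would have to test.

```python
import csv
import io


def read_rows(csv_text: str) -> list:
    """Parse CSV text and return the data rows as tuples (header skipped)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    return [tuple(row) for row in reader]


def new_or_changed_rows(old_text: str, new_text: str) -> list:
    """Rows in new_text that do not appear verbatim in old_text."""
    seen = set(read_rows(old_text))
    return [row for row in read_rows(new_text) if row not in seen]


old_snapshot = "id,name\n1,alice\n2,bob\n"
current_file = "id,name\n1,alice\n2,rob\n3,carol\n"

delta = new_or_changed_rows(old_snapshot, current_file)
print(delta)
```

Note that an updated row shows up as a new row here, so the old version of that row would still be in the index; you would need to pick the latest version per key at search time.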