Hi. This question is mostly theoretical, but it came up because of a practical issue. Suppose I have an app that starts a new log file every day at 00:00 and begins it with this header:
I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE.
12/02/2025 00:30:00 log #1
12/02/2025 00:30:01 log #2
12/02/2025 00:30:02 log #3
PREMISE: every day at 00:00 the log is rewritten from 0 bytes, starting with the same header. On the first day, the Splunk UF should compute a CRC over the first 256 bytes, treat the file as new, read its entries, and send them to the indexers. On the next day it should skip the file, thinking it is the same as the day before because the CRC of the first 256 bytes is identical, so I should find something like this in the UF log: File will not be read, is too small to match seekptr checksum [...] And in the index I should find only yesterday's entries. Now I force "initCrcLen = 1024", and I would expect indexing to start again, since the file has, for example, this content:
I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE. I'M STARTING TO WRITE.
AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG. AND NOW I START TO LOG.
12/02/2025 00:30:00 log #1
12/02/2025 00:30:01 log #2
12/02/2025 00:30:02 log #3
The next day, the first 1024 bytes will be the same again, so the UF marks the log as already sent yesterday and skips it again. OK, now I add "crcSalt = <SOURCE>" to the input. But in a few days, initCrcLen = 1024 [same header size] and crcSalt = <SOURCE> [same file path] will together produce an identical CRC, so the UF will think the log is still the same file and skip it! Am I wrong? I know this is an old and much-discussed question, but it would be interesting to know whether it is possible to tell the UF: "read the file without caring about its CRC; even if I have already indexed it, just read it and send it to the indexers!" 😉😉😉
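
To make the reasoning above concrete, here is a toy Python sketch of how I picture the check. It is only my assumption of the logic, not Splunk's real algorithm: the function names, the way the salt is mixed in, the header sizes, and the path /var/log/app/app.log are all hypothetical, and the seekptr part mentioned in the UF message is not modelled at all. It only shows that a CRC over a fixed-length prefix (plus a constant salt) cannot change unless the prefix window reaches bytes that actually differ between days.

```python
import zlib


def prefix_fingerprint(data: bytes, length: int = 256, salt: str = "") -> int:
    """Toy model: CRC32 over an optional salt plus the first `length` bytes."""
    return zlib.crc32(salt.encode("utf-8") + data[:length])


def make_log(header_size: int, date: str) -> bytes:
    """Build a fake daily log: a fixed header of `header_size` bytes, then one dated entry."""
    unit = b"I'M STARTING TO WRITE. "
    header = (unit * (header_size // len(unit) + 1))[:header_size]
    return header + b"\n" + date.encode("ascii") + b" 00:30:00 log #1\n"


PATH = "/var/log/app/app.log"  # hypothetical monitored path, identical every day


def same_fingerprint(header_size: int, length: int, salt: str = "") -> bool:
    """Compare the fingerprints of two consecutive daily files."""
    day1 = make_log(header_size, "12/02/2025")
    day2 = make_log(header_size, "13/02/2025")
    return prefix_fingerprint(day1, length, salt) == prefix_fingerprint(day2, length, salt)


if __name__ == "__main__":
    # Header longer than the window: both days look identical.
    print(same_fingerprint(2000, 256))         # True
    # A larger window and a constant salt do not help if the header is still longer.
    print(same_fingerprint(2000, 1024, PATH))  # True
    # Window reaches the dated entry: the fingerprints differ.
    print(same_fingerprint(300, 1024, PATH))   # False
```

In this model, a salt that is the same every day changes nothing, and a bigger window only helps once it covers the first timestamped entry.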