Hey all,
I have a daily .csv log file that gets updated with new info every time another app finishes some jobs. Splunk is indexing some duplicate events from these files. After reading a lot about how Splunk handles log files, I suspect that the file is being updated while Splunk is still indexing new lines from its latest version. For example:
File August-1.csv has the following lines at 10:00:00 A.M:
1
2
3
File August-1.csv has the following lines at 10:00:01 A.M:
1
2
3
4
5
File August-1.csv has the following lines at 10:00:03 A.M:
1
2
3
4
5
6
7
8
If Splunk is indexing the new lines (4, 5) from the 10:00:01 A.M version and doesn't finish by 10:00:03 A.M, does it save the seek address at "5" and then start indexing the new lines (6, 7, 8) from the 10:00:03 A.M version? Or is the current indexing cancelled, and indexing of the 10:00:03 A.M version restarted from the last known seek address, which was "3"?
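To make the two scenarios concrete, here is a toy Python model of a file tailer that remembers a byte offset between passes. It is only an illustration of my question, not Splunk's actual implementation; `read_new_lines`, `simulate`, and `commit_every_pass` are made-up names, and the "uncommitted second pass" is just my assumption about how the duplicates could arise.

```python
import os
import tempfile


def read_new_lines(path, seek_offset):
    """Return (complete_new_lines, new_offset) for data appended after seek_offset."""
    with open(path, "rb") as f:
        f.seek(seek_offset)
        data = f.read()
    # Consume only complete lines; a half-written line waits for the next pass.
    end = data.rfind(b"\n") + 1
    return data[:end].decode().splitlines(), seek_offset + end


def simulate(commit_every_pass):
    """Append three batches to a temp file and 'index' after each append.

    If commit_every_pass is False, the offset reached by the second pass is
    thrown away, modelling an interrupted pass that restarts from the old
    seek address (hypothetical behaviour, for illustration only).
    """
    indexed, saved_offset = [], 0
    batches = (["1", "2", "3"], ["4", "5"], ["6", "7", "8"])
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "August-1.csv")
        for i, batch in enumerate(batches):
            with open(path, "a") as f:
                f.write("".join(line + "\n" for line in batch))
            lines, new_offset = read_new_lines(path, saved_offset)
            indexed.extend(lines)
            if commit_every_pass or i != 1:
                saved_offset = new_offset
    return indexed


print(simulate(commit_every_pass=True))   # every pass saves its offset
print(simulate(commit_every_pass=False))  # second pass's offset is lost
```

In this model the first call indexes each line once, while the second call indexes lines 4 and 5 twice, which is the kind of duplication I am seeing.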
Thanks for any help!