I'm trying to nail down the corner cases I need to worry about as a Splunk forwarder installation newbie.
I want to feed it files, and am curious about its log rotation survival strategies.
Normally, when I rotate a log file (on Unix), I take the following actions for foo.log:

1. Create foo.log.new and set owner/perms.
2. Hardlink the existing foo.log to foo.log.old.
3. Move foo.log.new onto foo.log.
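For concreteness, the three steps above can be sketched in Python (the function name and permissions are my own choices, not part of any standard rotation tool):

```python
import os

def rotate(path):
    """Rotate `path` without racing appending writers:
    create .new, hardlink current to .old, rename .new onto the live name."""
    new = path + ".new"
    old = path + ".old"
    # 1. Create the replacement file with the desired owner/perms.
    fd = os.open(new, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o640)
    os.close(fd)
    # 2. Hardlink the existing log to its archival name; both names now
    #    refer to the same inode, so writers holding an open fd are unaffected.
    os.link(path, old)
    # 3. Atomically move the empty file onto the live name. Writers that
    #    re-open by name get the new inode; writers still holding the old
    #    fd keep appending to the inode now named foo.log.old.
    os.rename(new, path)
```

The rename in step 3 is atomic on POSIX filesystems, which is what makes the sequence safe for writers.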
This ensures there is no race condition or data loss for writers. It can, however, mean two logs being written to in parallel (when there are N>1 appending writers).
A log tailer can keep the file open and safely detect rotation by stat()ing the name it opened, checking whether that name now refers to a new inode, and then deciding when to release the old file handle and move to the new one (or whether to tail both for a while).
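That inode-comparison check is simple to express; here is a minimal Python sketch (the function name is mine, for illustration only):

```python
import os

def rotated(path, open_fd):
    """Return True once `path` names a different inode than the one
    we already have open on `open_fd`."""
    try:
        st = os.stat(path)        # stat by name follows the current name
    except FileNotFoundError:
        return True               # the name is gone entirely: rotated/removed
    cur = os.fstat(open_fd)       # stat the handle we are tailing
    # Compare device+inode: a new inode under the old name means rotation.
    return (st.st_dev, st.st_ino) != (cur.st_dev, cur.st_ino)
```

A tailer would poll this periodically and, on True, finish draining the old fd before switching to the new file.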
The case I'm looking at now involves the above with, thankfully, a single writer, but rotation is followed immediately by compression and then movement out of the monitored area, so the last data in the log is at risk if a file handle isn't kept open.
So, I'm looking for the birds and the bees of how Splunk monitors files:

- how often it checks files, and what the sequence of stat/open/close is;
- whether it uses inode numbers, or just open+read+CRC (and whether it reuses that CRC file handle);
- in what order data is sent when old and new both exist, the new may have data, and the old still has unread data;
- whether the two count as one sequential stream under a unique identifier, whose order I need preserved, or as separate streams, in which case Splunk might introduce reordering that doesn't exist in the files.
I've already read docs.splunk.com/Documentation/Splunk/latest/Data/HowLogFileRotationIsHandled and have noted time_before_close, which (with large values) could be extremely useful for dodging race conditions. But I don't know how it behaves when the file is rotated during a time_before_close interval and something starts writing to a new incarnation of a file Splunk believes it already has open, and whether it preserves order then.
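For reference, time_before_close is set per monitor stanza in inputs.conf; a sketch (the path is a placeholder for my environment, and 30 is an arbitrary value for illustration):

```ini
# inputs.conf — keep the fd open for 30s after hitting EOF
[monitor:///var/log/myapp/foo.log]
time_before_close = 30
```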
Splunk keeps information about every file it monitors, via CRC checks, in an internal index called the fishbucket. If you rotate logs as described in your post, Splunk handles monitoring as follows:
1. It finishes reading the old file.
2. For the new file, it computes a CRC over the file's first bytes (256 by default, configurable via initCrcLength; crcSalt can add extra data to the hash).
3. Since the new file won't have the same leading data, the CRC won't match any fishbucket entry, and Splunk starts indexing the file from the beginning.
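The idea behind that head-of-file CRC can be illustrated in a few lines of Python; this is a rough sketch of the concept, not Splunk's actual implementation:

```python
import zlib

def head_crc(path, n=256):
    """CRC-32 over a file's first n bytes — the idea behind Splunk's
    initCrcLength check (illustrative only, not Splunk's real code)."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read(n))
```

Two files whose first 256 bytes differ (e.g. a rotated log and its fresh replacement with a new first line) produce different CRCs, so the new file is treated as unseen; identical leading bytes would collide, which is what crcSalt and a larger initCrcLength exist to avoid.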
We use logrotate on Linux to manage firewall logs in our environment, and Splunk indexes the data correctly.