In most cases, each log is rolled to a file in the same directory, or a nearby one, either under the same name or under a name changed to include the date or an index, for instance some.log.2011-05-05 or some.log.1. In the vast majority of cases, Splunk handles this without issue because it identifies logs by checksums of their contents rather than by their names.
I have seen on a few occasions software that compresses the rolled log immediately. This is a problem for two reasons:
There are three scenarios I can imagine where Splunk will not have the log open:
A few options I see to deal with the issue:
How have others solved this problem?
The scenario is this:
1. Splunk is tailing a file.
2. Splunk closes the file for whatever reason: the file hasn't been modified for some time, Splunk is down, or there are so many files that Splunk has trouble keeping all the active ones open.
3. Entries are written to the file while Splunk has the file closed.
4. Before Splunk reopens the file, the file is rolled and compressed immediately.
The desire is for Splunk to handle this case, which unfortunately isn't that uncommon.
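To make the four steps concrete, here is a rough Python sketch of how I picture the failure. Everything in it is an assumption for illustration: `ToyTailer`, `HEAD_BYTES`, and the two archive-handling policies are hypothetical names, and this is not Splunk's actual implementation. It only shows why a check on the head of the decompressed content alone would drop the entries from step 3, while resuming from the remembered offset would pick them up.

```python
import gzip
import zlib

HEAD_BYTES = 256  # assumed size of the leading chunk that gets checksummed


def head_crc(data: bytes) -> int:
    """CRC32 of the first HEAD_BYTES bytes of the content."""
    return zlib.crc32(data[:HEAD_BYTES])


class ToyTailer:
    """Hypothetical stand-in for a file monitor; not Splunk's implementation."""

    def __init__(self):
        self.offsets = {}  # head CRC -> number of bytes already read

    def read_plain(self, data: bytes) -> bytes:
        """Return only the bytes past the last recorded offset."""
        key = head_crc(data)
        start = self.offsets.get(key, 0)
        self.offsets[key] = len(data)
        return data[start:]

    def read_archive_skip(self, gz: bytes) -> bytes:
        """Policy A: a known head CRC means the whole archive is skipped."""
        data = gzip.decompress(gz)
        if head_crc(data) in self.offsets:
            return b""
        return self.read_plain(data)

    def read_archive_resume(self, gz: bytes) -> bytes:
        """Policy B: decompress, then continue from the recorded offset."""
        return self.read_plain(gzip.decompress(gz))


def run_scenario(archive_reader_name: str) -> bytes:
    """Replay the scenario and return what is read from the archive."""
    tailer = ToyTailer()
    early = b"".join(b"entry %03d\n" % i for i in range(50))  # 500 bytes
    late = b"written while the file was closed\n"
    tailer.read_plain(early)              # steps 1-2: tail the file, then close it
    rolled = gzip.compress(early + late)  # steps 3-4: more entries, then roll and compress
    return getattr(tailer, archive_reader_name)(rolled)
```

Calling `run_scenario("read_archive_skip")` returns nothing from the archive, which is the data loss described above, while `run_scenario("read_archive_resume")` returns only the entries written while the file was closed.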
It thinks it has already indexed the entire file.
INFO ArchiveProcessor - Archive with path="/tmp/foo/XCA_XCPD-2011-04-11T00_08_081e.gz" was already indexed as a non-archive, skipping.
Best practice is:
1 - Use the latest version of Splunk (4.2.x). Splunk now decompresses the file and then performs the checksum. Thanks to transamrit for this!
2 - As mentioned, delay the compression by a day; most software allows this. That gives you a day to fix whatever the problem was (system or Splunk). You will also need to blacklist the .gz files or index only the .log file.
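If the software doing the rolling has no delay option of its own, something like the following could compress rolled logs only once they are at least a day old. This is a minimal sketch under my own assumptions: the function name, the `.log.` naming pattern, and the 24-hour threshold are all hypothetical and would need adjusting to the real rotation scheme.

```python
import gzip
import os
import shutil
import time


def compress_old_rolled_logs(directory, marker=".log.", min_age_seconds=24 * 3600, now=None):
    """Gzip rolled logs (names containing `marker`) not modified for min_age_seconds."""
    now = time.time() if now is None else now
    compressed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip existing archives, the active log, and anything that is not a regular file.
        if name.endswith(".gz") or marker not in name or not os.path.isfile(path):
            continue
        # Leave recently modified files alone so a monitor has time to catch up.
        if now - os.path.getmtime(path) < min_age_seconds:
            continue
        with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.remove(path)
        compressed.append(path + ".gz")
    return compressed
```

Run from a daily scheduled job, this leaves some.log and a freshly rolled some.log.1 untouched and only compresses files that have been idle for a full day.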
Hi, I have the same problem with rotated logfiles.
I'm using Universal Forwarder version 6.4.5 to monitor a log file and its rotated versions. There was a network outage, and the UF was not able to send its data for some time. In the meantime, the logs were rotated and zipped. The files it had never begun to read were read fine after the network problem was resolved, even the zipped ones. But the file it was reading at the beginning of the outage was only unzipped and then reported as "already read, so skipped".
When I manually unpacked the file and put it in place, the UF started reading where it had stopped because of the outage. So I think the UF is skipping the check of the seekCRC at seekAddress as mentioned here:
Does anyone know if this is resolved in any version?
CRC is against the beginning and end: http://www.splunk.com/base/Documentation/4.2.1/Data/Howlogfilerotationishandled
Are you checking the same file archived and NOT archived? 4.2.x should have a sense of state for a file that has been archived and matches the CRC of another file.
It doesn't seem to work for this particular use case. Does it only check the top of the file?
05-18-2011 15:27:34.271 -0500 INFO ArchiveProcessor - Archive with path="/tmp/foo/XCA_XCPD-2011-04-11T00_08_081e.gz" was already indexed as a non-archive, skipping.
My test was to move an indexed log aside, add something to the end, gzip it, and put it back.
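For anyone who wants to repeat that test, here is a small Python sketch of the same steps. The function name and the appended text are my own placeholders, and the `.gz` naming is an assumption; the archive name the real software produces may differ.

```python
import gzip
import os
import shutil
import tempfile


def make_rolled_archive(indexed_log, extra=b"appended after indexing\n"):
    """Move the log aside, append to it, gzip it, and put the archive back."""
    workdir = tempfile.mkdtemp()
    aside = os.path.join(workdir, os.path.basename(indexed_log))
    shutil.move(indexed_log, aside)       # move the indexed log aside
    with open(aside, "ab") as f:          # add something to the end
        f.write(extra)
    archive = indexed_log + ".gz"         # assumed archive name
    with open(aside, "rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)      # gzip it back into the original directory
    shutil.rmtree(workdir)
    return archive
```

After running it against a log that has already been indexed, the monitored directory contains only the archive, and the appended line is the part to search for in the index.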