Re: Rolled logs compressed immediately

vbumgarner · ‎05-18-2011

In most cases, each log is rolled to a file in the same directory, or even a nearby directory, either with the same name, or changed to include the date or an index. For instance some.log.2011-05-05 or some.log.1. In the vast majority of cases, Splunk handles this without issue because it uses checksums of the contents of the logs instead of the log names.

I have seen on a few occasions software that compresses the rolled log immediately. This is a problem for two reasons:

If Splunk does not have the file open at the time it is compressed, then anything written to the log after Splunk last closed the file will not be indexed.
If Splunk is configured to index the compressed logs, the data will be indexed twice. Why? Though Splunk will index compressed logs without issue, it has no way to know that it has already seen this log in uncompressed form.

There are three scenarios I can imagine where Splunk will not have the log open:

Splunk has closed the file because it hasn't been written to recently. This is controlled by time_before_close, and by default is 3 seconds.
Splunk has run out of file descriptors, and is waiting for one of the logs it currently has open to be closed.
Splunk is not running.

A few options I see to deal with the issue:

Change the software or configuration to wait a day before compressing a rolled log. This is the usual approach and is the best solution.
Increase the time_before_close to a fairly large number. This is only a remotely good idea if the number of active logs on the system is quite small. This also does not help if Splunk happens to be down during the roll.
Wait for the log to be rolled and only index the compressed log. This is not ideal, as the index will be out of date most of the time, and the indexers will do most of their work all at once.
Write a scripted input that handles the tailing and handles the compressed files as needed. This is an absolute last resort, because it introduces a complicated piece of code that must be maintained by the client.

How have others solved this problem?

Simeon · ‎05-19-2011

So is it re-indexing the file, or is it adding that data?

I would enable debug for the TailingProcessor to see what Splunk thinks. Can also use btprobe if you are familiar with it.

vbumgarner · ‎05-19-2011

The scenario is this:
1. Splunk is tailing a file.
2. Splunk closes the file for whatever reason, whether it hasn't been modified for some time, or Splunk is down, or there are so many files that Splunk is having trouble keeping all the active files open.
3. Entries are written to the file while Splunk has the file closed.
4. Before Splunk reopens the file, the file is rolled and compressed immediately.

The desire is for Splunk to handle this case, which unfortunately isn't that uncommon.

Simeon · ‎05-19-2011

I believe the check is supposed to force skipping of it if the CRC matches. So in theory, this is a good thing in ways. Why is there data being added after compression/roll?

vbumgarner · ‎05-19-2011

It thinks it has already indexed the entire file.

INFO ArchiveProcessor - Archive with path="/tmp/foo/XCA_XCPD-2011-04-11T00_08_081e.gz" was already indexed as a non-archive, skipping.

Simeon · ‎05-18-2011

Best practice is:

1 - Use the latest version of Splunk (4.2.x). Splunk now decompresses the file and then performs the checksum. Thanks to transamrit for this!

2 - As mentioned, delay the compression for a day as most software allows this. You will get a day to fix whatever your problem was (system or Splunk). You will need to blacklist the .gz or only index the .log file.

vbumgarner · ‎05-18-2011

The test was...
1. Index a file.
2. Move the file aside.
3. Add some entries to the end of that file.
4. gzip that file.
5. Put it back.

goelli · ‎12-10-2019

Hi, I have the same problem with rotated logfiles.
I'm using Universal Forwarder in version 6.4.5 to monitor a log file and it's rotated versions. There was a network outage and the UF was not able to send it's data for some time. In the meanwhile the logs were rotated and zipped. The files it never began to read were read fine after the Network problem was resolved - even the zipped ones. But the file it was reading at the beginning of the outage was only unzipped and then commented with "already read, so skipped".

When I manually unpacked the file and put it in place the UF started reading where it stopped because of the outage. So I think UF is skipping the check of the seekCRC at seekAdress as mentioned here:
https://docs.splunk.com/Documentation/Splunk/6.4.5/Data/HowLogFileRotationIsHandled

Does anyone know, if this is resolved in any Version?

Simeon · ‎05-18-2011

CRC is against the beginning and end: http://www.splunk.com/base/Documentation/4.2.1/Data/Howlogfilerotationishandled

Are you checking the same file archived and NOT archived? 4.2.x should have a sense of state for a file that has been archived and matches the crc of another file.

vbumgarner · ‎05-18-2011

It doesn't seem to work for this particular use case. Does it only check the top of the file?

05-18-2011 15:27:34.271 -0500 INFO ArchiveProcessor - Archive with path="/tmp/foo/XCA_XCPD-2011-04-11T00_08_081e.gz" was already indexed as a non-archive, skipping.

My test was to move an indexed log aside, add something to the end, gzip it, and put it back.

Simeon · ‎05-18-2011

It uses the same tailing processor, so yes. T

vbumgarner · ‎05-18-2011

Excellent! Is this supported in the universal forwarder, as well?

Rolled logs compressed immediately

Splunk App Dev Community Updates – What’s New and What’s Next

The Latest Cisco Integrations With Splunk Platform!

Enterprise Security Content Update (ESCU) | New Releases

Are you a member of the Splunk Community?

Rolled logs compressed immediately

Splunk App Dev Community Updates – What’s New and What’s Next

The Latest Cisco Integrations With Splunk Platform!

Enterprise Security Content Update (ESCU) | New Releases