Log rotation for compressed files

We are using Splunk to monitor the server.log file from a JBoss instance. The log rolls over daily: we use the logrotate utility to gzip server.log every day.

The folder contents look like this:

//var/log

//server.log

//server.log.June-12.gz

//server.log.June-13.gz

//server.log.June-14.gz

//server.log.June-15.gz

//

We use the universal forwarder on this Linux box to push data out to the indexer.

Currently, our configuration in inputs.conf on the forwarder side looks like this:

[monitor://var/log/jboss_logs/server*]

disabled=0

index=os

sourcetype=serverlog

Unfortunately, this picks up the daily server.log (as it should, because of the server* wildcard), but every day it also indexes the uncompressed content of the server.*.gz files that are out there.

Based on what is described here, log rotation handling apparently does not apply to the .gz and .tar file formats, because they are treated as new files:

http://docs.splunk.com/Documentation/Splunk/latest/Data/MonitorFilesAndDirectories

Does this mean that we will definitely see duplicates? Has anybody seen a problem like this before?

Re: Log rotation for compressed files

You can add whitelists/blacklists to your inputs.conf to filter out unwanted files:

blacklist = \.(gz)$

This should filter out anything in the folder with a .gz extension. (Or you could whitelist only the .log files to get the same result; it depends on what else is in there.)
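Put together with the stanza from the question, the result might look like the sketch below. The stanza header and the three existing settings are copied as posted in the question; only the blacklist line is new, and the commented-out whitelist line is the alternative mentioned above. This assumes nothing else in the folder needs to be indexed from a .gz file.

```ini
# inputs.conf on the forwarder (sketch; stanza copied from the question as posted)
[monitor://var/log/jboss_logs/server*]
disabled=0
index=os
sourcetype=serverlog
# Skip the rotated archives such as server.log.June-12.gz
blacklist = \.(gz)$
# Alternative: index only files whose names end in .log
# whitelist = \.log$
```

The forwarder needs to reload its configuration (for example, by restarting it) before the change takes effect. The filter only affects what is picked up from then on; anything already indexed from the .gz files stays in the index.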

http://docs.splunk.com/Documentation/Splunk/4.3.2/Data/Whitelistorblacklistspecificincomingdata
