On an active server, log4j is writing log files that Splunk is monitoring. Log4j is configured to roll the log files over once they reach 10MB in size. Here are the log4j settings in use:
log4j.appender.out=org.apache.log4j.RollingFileAppender
log4j.appender.out.layout=org.apache.log4j.PatternLayout
log4j.appender.out.file=logs/output.log
log4j.appender.out.encoding=UTF-8
log4j.appender.out.append=true
log4j.appender.out.maxFileSize=10MB
log4j.appender.out.maxBackupIndex=10
Splunk is configured to monitor the log files. Here is the stanza being used:
[monitor://D:\Server*\logs\output.log*]
_TCP_ROUTING = large_pool
index = javaServer
sourcetype = javaAppLogs
disabled = false
All works as expected until the server gets really busy. Once the logs are written out very quickly, for example filling 10MB in under a minute, they no longer roll when they reach 10MB in size. The files have grown to 7GB and keep growing until the server load starts to slow down, at which point log4j does start rolling the files again.
This only happens when Splunk is monitoring the log files.
It would seem that Splunk is holding a read lock on the log file while log4j is trying to roll it, which stops log4j from rolling the file. Log4j is still able to log to the file, just not roll it.
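To make the failure mode concrete: log4j 1.x's RollingFileAppender rolls over by renaming the backup chain and then renaming the active file, and on Windows a rename fails while another process holds the file open, after which the appender simply keeps appending to the oversized file. A minimal sketch of that rollover logic (in Python, names and details simplified, not log4j's actual code):

```python
import os

def roll_over(base="output.log", max_backup_index=10):
    """Mimic the shape of log4j 1.x RollingFileAppender.rollOver():
    shift output.log.1 -> output.log.2, ..., then rename
    output.log -> output.log.1."""
    # Drop the oldest backup if it exists.
    oldest = f"{base}.{max_backup_index}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Shift the remaining backups up by one index.
    for i in range(max_backup_index - 1, 0, -1):
        if os.path.exists(f"{base}.{i}"):
            os.replace(f"{base}.{i}", f"{base}.{i + 1}")
    try:
        # This is the rename a reader's open handle can block on Windows.
        os.replace(base, f"{base}.1")
        return True
    except OSError:
        # Rollover silently skipped; the active file just keeps growing.
        return False

def append_log(line, base="output.log", max_file_size=10 * 1024 * 1024):
    """Size is checked around each write, as with RollingFileAppender."""
    if os.path.exists(base) and os.path.getsize(base) >= max_file_size:
        roll_over(base)
    with open(base, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

This matches the observed symptom: writing continues to work (appending needs no rename), only the roll step is blocked while the reader holds the file.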
I have tried changing the value of the inputs.conf time_before_close option. The default is 3 seconds and I have set it to 1, but the problem still occurs. Should I set it to 0? I get the impression that could have some bad side effects.
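For reference, this is how the lowered setting was applied, directly in the monitor stanza (the stanza values are from above; time_before_close is a standard inputs.conf setting, in seconds):

[monitor://D:\Server*\logs\output.log*]
_TCP_ROUTING = large_pool
index = javaServer
sourcetype = javaAppLogs
disabled = false
time_before_close = 1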
Totally understandable. The appenders I shared in the previous post do have a built-in in-memory buffer to deal with such outages, however, analogous to the Universal Forwarder. Something to consider anyway.