On an active server, log4j is writing log files that Splunk is monitoring. Log4j is configured to roll the log files over once they reach 10MB in size. Here are the log4j settings in use:
log4j.appender.out=org.apache.log4j.RollingFileAppender
log4j.appender.out.layout=org.apache.log4j.PatternLayout
log4j.appender.out.file=logs/output.log
log4j.appender.out.encoding=UTF-8
log4j.appender.out.append=true
log4j.appender.out.maxFileSize=10MB
log4j.appender.out.maxBackupIndex=10
Splunk is configured to monitor the log files. Here is the stanza being used:
[monitor://D:\Server*\logs\output.log*]
_TCP_ROUTING = large_pool
index = javaServer
sourcetype = javaAppLogs
disabled = false
All works as expected until the server gets really busy. Once the logs are written out very quickly, for example filling 10MB in under a minute, they no longer roll when they reach 10MB in size. The files have grown to 7GB and keep growing until the server load starts to slow down, at which point log4j does start rolling the files again.
This only happens when Splunk is monitoring the log files.
It would seem that Splunk is holding a read lock on the log file while log4j is trying to roll it, which stops log4j from rolling the file. Log4j is still able to log to the file, just not roll it.
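To make the failure mode concrete: log4j 1.x's RollingFileAppender rolls over by renaming the backup chain and then renaming the active file, and on Windows a rename fails while another process holds the file open, after which the appender simply keeps appending to the oversized file. A minimal sketch of that rollover logic (in Python, names and details simplified, not log4j's actual code):

```python
import os

def roll_over(base="output.log", max_backup_index=10):
    """Mimic the shape of log4j 1.x RollingFileAppender.rollOver():
    shift output.log.1 -> output.log.2, ..., then rename
    output.log -> output.log.1."""
    # Drop the oldest backup if it exists.
    oldest = f"{base}.{max_backup_index}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Shift the remaining backups up by one index.
    for i in range(max_backup_index - 1, 0, -1):
        if os.path.exists(f"{base}.{i}"):
            os.replace(f"{base}.{i}", f"{base}.{i + 1}")
    try:
        # This is the rename a reader's open handle can block on Windows.
        os.replace(base, f"{base}.1")
        return True
    except OSError:
        # Rollover silently skipped; the active file just keeps growing.
        return False

def append_log(line, base="output.log", max_file_size=10 * 1024 * 1024):
    """Size is checked around each write, as with RollingFileAppender."""
    if os.path.exists(base) and os.path.getsize(base) >= max_file_size:
        roll_over(base)
    with open(base, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

This matches the observed symptom: writing continues to work (appending needs no rename), only the roll step is blocked while the reader holds the file.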
I have tried changing the value of the inputs.conf time_before_close option. The default is 3 seconds and I have set it to 1, but the problem still occurs. Should I set it to 0? I get the impression that could have some bad side effects.
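For reference, this is how the lowered setting was applied, directly in the monitor stanza (the stanza values are from above; time_before_close is a standard inputs.conf setting, in seconds):

[monitor://D:\Server*\logs\output.log*]
_TCP_ROUTING = large_pool
index = javaServer
sourcetype = javaAppLogs
disabled = false
time_before_close = 1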
Totally understandable. The appenders I shared in the previous post do have a built-in in-memory buffer to deal with such outages, however, analogous to the Universal Forwarder. Something to consider anyway.