Deployment Architecture

Issues handling rolled over gzip archives

markmitrushkin
New Member

Hi,

We are using the Splunk forwarder (splunkforwarder-6.6.2-4b804538c686-linux-2.6-x86_64-manifest) to gather logs from Java applications configured via logback.xml with a rolling policy. Sometimes, when a log file is rolled and compressed, the Splunk process goes into a loop and consumes a lot of CPU:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9064 splunk    20   0  253m 157m  12m S 81.2  4.1 129:18.10 splunkd
 9072 splunk    20   0 70316  11m 1456 S  3.7  0.3   5:32.13 splunkd
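
For reference, the logback rolling policy in question looks roughly like this (a minimal sketch; the appender name, paths, size limit, and retention are illustrative assumptions, not our exact config). The .gz suffix on the fileNamePattern is what makes logback compress each rolled file:

```xml
<!-- Sketch of a logback.xml rolling appender that compresses on rollover. -->
<!-- All names, paths, and limits here are illustrative assumptions. -->
<appender name="APP" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>/export/logs/app/app.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
    <!-- Ending the pattern in .gz makes logback gzip each rolled file,
         producing names like app-2018-03-19.5.log.gz -->
    <fileNamePattern>/export/logs/app/app-%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
    <maxFileSize>10MB</maxFileSize>
    <maxHistory>7</maxHistory>
  </rollingPolicy>
  <encoder>
    <pattern>%d %-5level %logger{36} - %msg%n</pattern>
  </encoder>
</appender>
```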

/opt/splunkforwarder/var/log/splunk/splunkd.log is full with messages like this:

03-19-2018 16:00:13.237 +0100 INFO  ArchiveProcessor - Handling file=/export/logs/app/app-2018-03-19.5.log.gz
03-19-2018 16:00:13.237 +0100 INFO  ArchiveProcessor - reading path=/export/logs/app/app-2018-03-19.5.log.gz (seek=0 len=6818141)
03-19-2018 16:00:13.283 +0100 WARN  ArchiveProcessor -   Could not access archive=/export/logs/app/app-2018-03-19.5.log.gz, because:retrieve: key=0x963d68b47a9905d1 is already locked with state=0x7f2436c97200
03-19-2018 16:00:13.283 +0100 INFO  ArchiveProcessor - Handling file=/export/logs/app/app-2018-03-19.5.log.gz
03-19-2018 16:00:13.284 +0100 INFO  ArchiveProcessor - reading path=/export/logs/app/app-2018-03-19.5.log.gz (seek=0 len=6818141)
03-19-2018 16:00:13.327 +0100 WARN  ArchiveProcessor -   Could not access archive=/export/logs/app/app-2018-03-19.5.log.gz, because:retrieve: key=0x963d68b47a9905d1 is already locked with state=0x7f2436c97200

It would also spawn a lot of child processes and update "/var/run/splunk/splunkd.pid".

The creation time of that app-2018-03-19.5.log.gz file correlates with the start of the CPU spike. The source code and documentation of Logback [https://logback.qos.ch/documentation.html] indicate that it does not use temporary files when rolling and compressing logs.


richgalloway
SplunkTrust

Don't use followTail. The Admin manual says "Do not use followTail for rolling log files (log files that get renamed as they age)."

Try one of these settings. Notice I changed ignoreOlderThan to allow for outages.

[monitor:///export/logs/app/*.log]
ignoreOlderThan = 3d

or

[monitor:///export/logs/app/]
ignoreOlderThan = 3d
whitelist = ^/export/logs/app/app.*\.log$
---
If this reply helps you, an upvote would be appreciated.

markmitrushkin
New Member

followTail = 0 did not solve the issue; the archive still gets locked with a state. Restarting the Splunk service resolves the issue temporarily: the lock is lifted and the same archive is processed with "was already indexed as a non-archive, skipping".

The next step would be to exclude *.gz files from processing, which might mean that in some scenarios we are losing log entries.
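
If we go that route, a blacklist in inputs.conf could look like this (a sketch building on richgalloway's earlier suggestion; blacklist is matched as a regex against the full file path):

```
[monitor:///export/logs/app/]
ignoreOlderThan = 3d
whitelist = ^/export/logs/app/app.*\.log$
blacklist = \.gz$
```

With the whitelist anchored on \.log$ the blacklist is arguably redundant, but it makes the intent to skip compressed archives explicit.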


logloganathan
Motivator

Please check the inputs.conf settings in Splunk.

Please refer to the document below; it should help:

http://docs.splunk.com/Documentation/Splunk/6.1/Admin/Inputsconf

Hope this answer helps.


markmitrushkin
New Member

Hi, thank you for your answer. Could you be more specific about which setting might be causing the issue? The only one I could think of is followTail = 1. This issue doesn't happen consistently, only in some cases when the logs are rolled over and compressed.


richgalloway
SplunkTrust

What are the forwarder's inputs.conf settings?

---
If this reply helps you, an upvote would be appreciated.

markmitrushkin
New Member

[monitor:///export/logs/app/]
followTail = 1
ignoreOlderThan = 1d
whitelist=^/export/logs/app/app.*$