Splunk 6.6.1 stops monitoring files

craigkleen
Communicator

After recently upgrading from 6.5.3 to 6.6.1, I started running into a situation where at least one of my Heavy Forwarders intermittently stops sending data. The Heavy Forwarder runs on a virtualized RedHat 6.x system with the latest patches.

After reading some answers, I found the command:

./bin/splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus

For at least one of the monitored files, it output the following:

<s:key name="/var/log/aruba/controller1.log">
  <s:dict>
    <s:key name="file position">22440152</s:key>
    <s:key name="file size">20726882</s:key>
    <s:key name="parent">/var/log/aruba/*.log</s:key>
    <s:key name="percent">108.27</s:key>
    <s:key name="type">open file</s:key>
  </s:dict>
</s:key>

Being over 100% read seems odd: the reported file position is larger than the file size. When it got into this state, I checked the metrics in _internal for this monitored file and they had stopped recording. On RedHat I use 'logrotate', running on an hourly schedule, and the change from the file being read to not being read seems to have occurred on the hour. It would stop working for 1-3 hours, then suddenly start reading again all by itself. This worked with 6.5.3, but not with 6.6.1. I downgraded this one Heavy Forwarder to 6.5.3 last night and it's back to working again, so I've ruled out logrotate and the OS patches as the cause.
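To spot this state across many monitored files without reading the output by eye, something like the following could scan the FileStatus output for entries whose file position exceeds the file size. This is only a sketch: it assumes the output is shaped like the fragment above (one `<s:key>` per file wrapping an `<s:dict>` of fields), and the function name `overread_files` and the `SAMPLE` constant are made up for illustration.

```python
import re

# One block per monitored file: <s:key name="PATH"> followed by an <s:dict>
# that contains no nested <s:dict> (so wrapper keys are skipped).
FILE_BLOCK = re.compile(
    r'<s:key name="(?P<path>[^"]+)">\s*<s:dict>'
    r'(?P<body>(?:(?!<s:dict>).)*?)</s:dict>',
    re.DOTALL,
)
# Simple name/value fields inside a file's dict.
FIELD = re.compile(r'<s:key name="(?P<name>[^"]+)">(?P<value>[^<]*)</s:key>')


def overread_files(status_xml):
    """Return (path, position, size) for files read past their current size."""
    flagged = []
    for block in FILE_BLOCK.finditer(status_xml):
        fields = {m["name"]: m["value"] for m in FIELD.finditer(block["body"])}
        try:
            position = int(fields["file position"])
            size = int(fields["file size"])
        except (KeyError, ValueError):
            continue  # entry has no usable position/size (e.g. a directory)
        if position > size:
            flagged.append((block["path"], position, size))
    return flagged


# The fragment from this post, used as sample input.
SAMPLE = """
<s:key name="/var/log/aruba/controller1.log">
  <s:dict>
    <s:key name="file position">22440152</s:key>
    <s:key name="file size">20726882</s:key>
    <s:key name="parent">/var/log/aruba/*.log</s:key>
    <s:key name="percent">108.27</s:key>
    <s:key name="type">open file</s:key>
  </s:dict>
</s:key>
"""

flagged = overread_files(SAMPLE)
```

Feeding it the saved output of the command above (for example, redirected to a file and read back in) would list every file in the same over-100% state, which may help correlate the stalls with rotation times.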

Can anyone advise on some other things to try with 6.6.1 to see if there's some new setting I should be using?

davpx
Communicator

You should upgrade. There are known bugs for tailing files in versions this old.

woodcock
Esteemed Legend

What was the final resolution on this from support, @craigkleen?

craigkleen
Communicator

Sorry for the delay. Splunk's response wasn't great. I basically had to write a "restart" into my log rotation script. It might be fixed with the 7.x series now, but I haven't had the time to go back and check.
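The actual rotation script was not shared, so the following is only a guess at what such a workaround could look like: a small helper that the rotation job calls after rotating, which runs `splunk restart`. The install path `/opt/splunk` and the names `restart_command` and `restart_splunk` are assumptions for illustration.

```python
import subprocess
from pathlib import Path


def restart_command(splunk_home="/opt/splunk"):
    """Build the CLI call; splunk_home is an assumed install path."""
    return [str(Path(splunk_home) / "bin" / "splunk"), "restart"]


def restart_splunk(splunk_home="/opt/splunk", timeout=300):
    """Restart the forwarder and return the CLI exit code (0 on success)."""
    result = subprocess.run(restart_command(splunk_home), timeout=timeout)
    return result.returncode
```

Calling `restart_splunk()` from the end of the rotation job forces the forwarder to reopen the monitored files. A restart briefly interrupts forwarding, so this is a stopgap rather than a fix.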

reswob4
Builder

I think I have the same problem, or at least so similar it might as well be the same.

I have a CentOS 6.x HF that receives data via rsyslog and writes it to different dated files based on filters. For example, logs from routers are saved in a file called yyyy-mm-dd-rtrs.log. After I upgraded on Sept 9, all those feeds stopped (and, more annoyingly, none of my alerts for missing data fired... argh). All files labeled 2017-09-08-xxx.log are indexed. All files labeled 2017-09-09-xxx.log and later are not.

Please keep us updated on this issue.

delink
Communicator

We do have a case open with Splunk on this, and it is a noted bug now. My customer was running the case with support, so I do not have the JIRA for it, but they did offer a hotfix, so I think this may be fixed in either 6.6.3 or whenever 6.6.4 comes out.

delink
Communicator

Did a case get opened on this? I think I am seeing the same thing, and we can open one also.

craigkleen
Communicator

I have, and submitted a diag. Waiting to hear back.

woodcock
Esteemed Legend

Definitely open a support case and add the bug tag to this question.

craigkleen
Communicator

Will do. Thx
