After recently upgrading from 6.5.3 to 6.6.1, I started running into a situation where at least one of my Heavy Forwarders would intermittently stop sending data. The Heavy Forwarder runs on a virtualized Red Hat 6.x system with the latest patches.
After reading some answers, I found the command:
./bin/splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus
For at least one of the monitored files, it output the following:
<s:key name="/var/log/aruba/controller1.log">
  <s:dict>
    <s:key name="file position">22440152</s:key>
    <s:key name="file size">20726882</s:key>
    <s:key name="parent">/var/log/aruba/*.log</s:key>
    <s:key name="percent">108.27</s:key>
    <s:key name="type">open file</s:key>
  </s:dict>
</s:key>
A file showing as over 100% read seems odd. When it got into this state, I checked the metrics in _internal for this monitored file, and they had stopped recording. On Red Hat I use 'logrotate' on an hourly schedule, and the change from the file being read to stopping seems to have occurred on the hour. It would stop working for 1-3 hours, then suddenly start reading again all by itself. This worked fine on 6.5.3 but not on 6.6.1. Last night I downgraded this one Heavy Forwarder back to 6.5.3 and it's working again, so I've ruled out logrotate and other OS patches as the cause.
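For reference, here is a minimal sketch of the kind of hourly logrotate stanza described above. The path matches the monitor stanza in the XML output; everything else (rotation count, options) is an assumption, not the actual config:

```
# /etc/logrotate.d/aruba -- hypothetical stanza approximating the setup described.
# Note: the 'hourly' keyword needs logrotate >= 3.8.5; on older RHEL 6 builds
# you would instead run logrotate from an hourly cron job.
/var/log/aruba/*.log {
    hourly
    rotate 24
    missingok
    notifempty
    # 'copytruncate' truncates the file in place instead of creating a new
    # inode; the tailing processor's position tracking behaves differently
    # under the two rotation styles, so this is one variable worth testing
    # against the 6.6.1 behavior.
    copytruncate
}
```

One thing copytruncate would explain here: if Splunk's saved "file position" is larger than the truncated file's new size, you get exactly the over-100% "percent" value shown above until the reader catches up.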
Can anyone advise on some other things to try with 6.6.1 to see if there's some new setting I should be using?
I think I have the same problem, or at least so similar it might as well be the same.
I have a CentOS 6.x HF that receives data via rsyslog and writes it to different dated files based on filters, so logs from routers are saved in a file called yyyy-mm-dd-rtrs.log. After I upgraded on Sept 9, all those feeds stopped (and, more annoyingly, none of my alerts for missing data fired... argh). All files named 2017-09-08-xxx.log were indexed; everything named 2017-09-09-xxx.log and later was not.
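For context, a minimal sketch of the dated-file rsyslog setup described, in the legacy syntax that ships with CentOS 6 (the file path, template name, and source-IP filter are assumptions for illustration):

```
# Hypothetical /etc/rsyslog.d/feeds.conf
# Write router logs into a per-day file named yyyy-mm-dd-rtrs.log.
$template RtrDaily,"/var/log/feeds/%$year%-%$month%-%$day%-rtrs.log"

# Property-based filter: anything from the router subnet goes to the
# dated file, then is discarded so it doesn't also hit the default rules.
:fromhost-ip, startswith, "10.0." ?RtrDaily
& ~
```

Because the filename changes at midnight, every new day is a brand-new file for the monitor input to discover, which is why a tailing-processor regression would show up as exactly this "yesterday indexed, today not" pattern.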
Please keep us updated on this issue.
We do have a case open with Splunk on this, and it is now a confirmed bug. My customer was running the case with support, so I do not have the JIRA number for it, but they did offer a hotfix, so I expect this will be fixed in either 6.6.3 or whenever 6.6.4 comes out.
Sorry for the delay. Splunk's response wasn't great; I basically had to write a restart into my log rotation script. It might be fixed in the 7.x series by now, but I haven't had time to go back and check.
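For anyone hitting the same thing, here is a sketch of that workaround as a logrotate postrotate hook. The log path, rotation schedule, and Splunk install location are assumptions about the setup, and a full restart is heavy-handed (it briefly interrupts all forwarding), but it does force the tailing processor to re-open its monitored files:

```
# Hypothetical /etc/logrotate.d/feeds stanza with the restart workaround
/var/log/feeds/*.log {
    daily
    rotate 7
    missingok
    sharedscripts
    postrotate
        # Restart the Heavy Forwarder after rotation so the tailing
        # processor re-discovers the rotated/new files.
        /opt/splunk/bin/splunk restart >/dev/null 2>&1 || true
    endscript
}
```

'sharedscripts' makes the postrotate block run once per rotation pass rather than once per matched file, so the forwarder is only restarted a single time.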