Hi All,
I am running into a few errors on my host that is monitoring some logs in RHEL. One of the logs in question could write, fill up, close and rewrite again, all within a second.
A few errors in my splunkd on the host:
05-12-2014 13:25:29.087 -0700 ERROR WatchedFile - Error reading file 'LOG LOCATION': Stale NFS file handle
05-12-2014 13:25:29.087 -0700 ERROR TailingProcessor - error from read call from 'LOG LOCATION'.
05-12-2014 13:26:24.187 -0700 INFO WatchedFile - File too small to check seekcrc, probably truncated. Will re-read entire file='LOG LOCATION'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I am running crcSalt =
Anyone have any ideas?
Thanks in advance!
It is unlikely the crcSalt option is going to help you in this case. This sounds like a fairly classic race condition. One of the things Splunk does is stat(2) a file to see if the modtime / size has changed. If your files are completely changing in a very short period of time, then the file could be changed out from under Splunk between the stat() call and the open() call.
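To make the race window concrete, here is a minimal sketch (not Splunk's actual code) of the stat-then-open pattern a tailing reader uses. The function name and logic are illustrative only; the point is that the file can be truncated or replaced between the two system calls, which on NFS can surface as the ESTALE ("Stale NFS file handle") error you are seeing.

```python
import errno
import os

def read_if_changed(path, last_size, last_mtime):
    """Return file contents if size/mtime changed since the last check, else None.

    Mirrors a tailing reader: stat() first, then open(). The gap between
    the two calls is the race window -- a file replaced in that window
    can raise OSError with errno ESTALE on NFS.
    """
    st = os.stat(path)
    if (st.st_size, st.st_mtime) == (last_size, last_mtime):
        return None  # unchanged since last check, skip the read
    # Race window: the file may be truncated/replaced right here.
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as e:
        if e.errno == errno.ESTALE:
            return None  # file went stale between stat() and open()
        raise
```

No amount of retry logic fully closes the window; it only narrows it, which is why the options below are a long shot rather than a fix.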
It probably won't work, but you can try the time_before_close option and the always_open_file option in inputs.conf. These may help (but most likely will not - race conditions are hard).
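As a sketch, a monitor stanza with those two options might look like the following. The monitored path and the time_before_close value are placeholders to adapt to your setup, not recommendations:

```
[monitor:///var/log/myapp/app.log]
# Seconds of modtime delta required before Splunk can close the file on EOF
time_before_close = 10
# Open the file to check whether it has already been indexed
always_open_file = 1
```

Both settings go in the inputs.conf on the forwarder/host doing the monitoring, and a restart of splunkd is needed for them to take effect.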
Agreed. There's nothing you can do here other than to increase the amount of time the file sticks around.