Hi,
I am using Splunk v6.1.2 with a Developer license (up to 10 GB/day).
The source is a shared folder, in a flat structure, into which many small text files are created all the time.
This source is read in batch mode, which, as I understand it, is like monitor mode except that the files get deleted after being read. See the stanza:
> [batch://E:\\temp]
> move_policy = sinkhole
> disabled = false
> followTail = 0
> index = tipfx_index
> sourcetype = TipfxTests
Seeing that many files are not getting deleted, I looked at $SPLUNK_HOME\var\log\splunk\splunkd.log and found the following warnings:
> 08-15-2014 20:41:30.209 +0300 WARN FileClassifierManager - Unable to open 'E:\\temp\\Int3Blu-3332246.tip'.
> 08-15-2014 20:41:30.209 +0300 WARN FileClassifierManager - The file 'E:\\temp\\Int3Blu-3332246.tip' is invalid. Reason: cannot_read
> 08-15-2014 20:41:30.209 +0300 INFO TailingProcessor - Ignoring file 'E:\\temp\\Int3Blu-3332246.tip' due to: cannot_read
> 08-15-2014 20:41:31.209 +0300 WARN FileClassifierManager - Unable to open 'E:\\temp\\Int3Blu-3332246.tip'.
> 08-15-2014 20:41:31.209 +0300 WARN FileClassifierManager - The file 'E:\\temp\\Int3Blu-3332246.tip' is invalid. Reason: cannot_read
> 08-15-2014 20:41:31.209 +0300 INFO TailingProcessor - Ignoring file 'E:\\temp\\Int3Blu-3332246.tip' due to: cannot_read
> 08-15-2014 20:41:32.225 +0300 WARN FileClassifierManager - Unable to open 'E:\\temp\\Int3Blu-3332246.tip'.
> 08-15-2014 20:41:32.225 +0300 WARN FileClassifierManager - The file 'E:\\temp\\Int3Blu-3332246.tip' is invalid. Reason: cannot_read
> 08-15-2014 20:41:32.225 +0300 INFO TailingProcessor - Ignoring file 'E:\\temp\\Int3Blu-3332246.tip' due to: cannot_read
I looked at the logs of the process that creates those input files, and the warnings above appeared at just about the same time the files were created in the shared folder.
However, if I restart the splunkd service, these files are read, indexed, and purged.
All of the above leads me to think that Splunk tries to read the files while they are still being created and not yet closed, which is why it cannot read them.
Can anyone tell me what I am doing wrong here, or how this can be avoided?
Also, if Splunk did not read and index the files (obviously, since they were not deleted), why does splunkd.log show warnings instead of errors?
thanks,
Adi
Hi adishilo,
since you're on Windows, did you try the MonitorNoHandle stanza in inputs.conf for this file?
This Windows-only input allows you to read files on Windows systems as Windows writes to them. It uses a kernel-mode filter driver to capture raw data as it gets written to the file. Use this input stanza on files that get locked open for writing.
Note: You can only monitor single files with MonitorNoHandle; you cannot monitor directories. If a file you choose to monitor already exists, Splunk does not index its current contents, only new information that comes into the file as it gets written to.
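For example, a minimal stanza could look something like this (the file path below is just a placeholder; index and sourcetype are copied from your batch stanza):
> [MonitorNoHandle://E:\temp\Int3Blu-3332246.tip]
> disabled = false
> index = tipfx_index
> sourcetype = TipfxTests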
cheers, MuS
Thanks MuS,
Unfortunately, I need to monitor a whole directory.
In the meantime, I've found a workaround: every couple of minutes, 'touch' all the files in the directory. It seems to work, since trying that on files which Splunk reported it cannot read causes it to read, index, and delete them from the directory.
The equivalent of the Linux 'touch' in Windows is:
> copy /b <filename>+,,
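Something like the following batch file, run every couple of minutes via Task Scheduler, is a rough sketch of that workaround (the folder and file mask are assumptions based on my setup; adjust as needed):
> @echo off
> rem Touch every .tip file in the watched folder so Splunk retries reading it
> cd /d E:\temp
> for %%f in (*.tip) do copy /b "%%f"+,, >nul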
thanks,
Adi