I have a plain-text log file that Splunk refuses to monitor because it classifies the file as binary. The Linux file command says otherwise:
file /usr/local/rex/azkaban/logs/azkaban.log
/usr/local/rex/azkaban/logs/azkaban.log: ASCII text, with very long lines
This is what splunkd.log reports:
10-22-2012 17:53:21.733 +0000 WARN FileClassifierManager - The file '/usr/local/rex/azkaban/logs/azkaban.log' is invalid. Reason: binary
10-22-2012 17:53:21.734 +0000 INFO TailingProcessor - Ignoring file '/usr/local/rex/azkaban/logs/azkaban.log' due to: binary
This is a hex dump of the first 32 bytes of the file in question. I do not see any badly encoded character or any magic number that would indicate the file is binary.
0000000: 3230 3132 2d31 302d 3234 2030 393a 3536 2012-10-24 09:56
0000010: 3a33 332c 3233 3320 494e 464f 2020 5b54 :33,233 INFO [T
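The dump above covers only the start of the file, so a stray byte further in could still be what trips the check. A minimal Python sketch for scanning the whole file is below. The set of bytes it treats as "text" (printable ASCII plus tab, LF and CR) is my own assumption for illustration and is not Splunk's actual rule, which is exactly what I am asking about; the function names are hypothetical as well.

```python
import sys

# Bytes treated as "text": printable ASCII plus tab, LF, CR.
# ASSUMPTION: this allowed set is illustrative only; it is not Splunk's rule.
TEXT_BYTES = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def find_suspicious_bytes(data: bytes, limit: int = 10):
    """Return (offset, byte) pairs for the first `limit` bytes outside TEXT_BYTES."""
    hits = []
    for offset, byte in enumerate(data):
        if byte not in TEXT_BYTES:
            hits.append((offset, byte))
            if len(hits) >= limit:
                break
    return hits


def scan_file(path: str, limit: int = 10):
    """Read the whole file in binary mode and report suspicious bytes."""
    # Reads the file into memory at once; fine for a sketch, not for huge logs.
    with open(path, "rb") as handle:
        return find_suspicious_bytes(handle.read(), limit)


if __name__ == "__main__" and len(sys.argv) > 1:
    for offset, byte in scan_file(sys.argv[1]):
        print(f"offset {offset:#010x}: byte 0x{byte:02x}")
```

Saved under a name such as scan.py (a hypothetical file name), it can be run as: python scan.py /usr/local/rex/azkaban/logs/azkaban.log. No output means every byte falls inside the assumed text set; any offsets printed are places worth inspecting with a hex dump.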
This question remains unanswered in this forum.
What steps does Splunk use to decide whether a file is binary? The man page of the Unix "file" command is an example of the kind of explanation I am looking for: it clearly describes how the classification is made. With that information I can fix this problem at its root.
Why does Splunk report this file as binary?
How can this problem be solved?
This issue has been addressed previously. Example:
http://splunk-base.splunk.com/answers/7370/splunk-thinks-text-file-is-binary
Thanks,
Lp