I have sinkhole directory which eats pretty much anything what goes in, but there are bunch of log files which are not indexed nor deleted.
With vim I can see some special characters: ^@ at the end of first two fields and ^Z at the end of file. With :set list option in addition there are displayed only $ (eol) which is perfectly fine.
My question is what is this hidden character (^@) and can it prevent Splunk from indexing?
I've tried cat old.log > new.log, but this does not eliminates those characters, they are not displayed with cat though (unless -v is specified).
09-27-2011 20:39:05.104 +0200 ERROR TailingProcessor - File will not be read, is too small to match seekptr checksum (file=/logrepo/not_indexed.txt). Last time we saw this initcrc, filename was different. You may wish to use a CRC salt on this source. Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.
thanks for replies, I've tried hexdump -C could not notice anything suspicious.
inputs.conf is really simple, just move policy set to sinkhole nothing else. I start to suspect maximum field length restriction. Here is only header, but it is sufficient to reproduce problem "not_indexed.txt" does not get indexed, but "indexed.txt" does. only difference is that indexed.txt has one less underscore character in the first field. http://dl.dropbox.com/u/8430959/indexed.txt http://dl.dropbox.com/u/8430959/not_indexed.txt
It sounds like you have a separate issue of the fact that logs are not being indexed. Supplying your inputs.conf settings as well as a log sample would be helpful in debugging your problem.