Getting Data In

Hidden control characters in logs

giovere
Path Finder

I have a sinkhole directory that consumes pretty much anything dropped into it, but there are a bunch of log files that are neither indexed nor deleted.

With vim I can see some special characters: ^@ at the end of the first two fields and ^Z at the end of the file. With the :set list option, the only additional characters displayed are $ (end of line), which is perfectly fine.

My question is: what is this hidden character (^@), and can it prevent Splunk from indexing the file?

I've tried cat old.log > new.log, but this does not eliminate those characters. They are not displayed by cat, though (unless -v is specified).
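
For reference, ^@ and ^Z are caret notation: the byte value is the ASCII code of the letter after the caret with bit 0x40 flipped, so ^@ is 0x00 (NUL) and ^Z is 0x1A (SUB, the old DOS end-of-file marker). A plain cat copies bytes unchanged, which is why the redirect keeps them. The following is a minimal Python sketch, not a Splunk feature; the helper names caret_to_byte and strip_bytes are hypothetical, and it assumes the file is small enough to read into memory.

```python
def caret_to_byte(caret: str) -> int:
    """Convert caret notation such as '^@' or '^Z' to its byte value."""
    if len(caret) != 2 or caret[0] != "^":
        raise ValueError(f"not caret notation: {caret!r}")
    # Caret notation flips bit 0x40 of the printed character's ASCII code.
    return ord(caret[1]) ^ 0x40


def strip_bytes(data: bytes, unwanted: bytes = b"\x00\x1a") -> bytes:
    """Return data with every byte listed in `unwanted` removed."""
    # bytes.translate(None, delete) deletes bytes without remapping the rest.
    return data.translate(None, unwanted)


if __name__ == "__main__":
    print(hex(caret_to_byte("^@")), hex(caret_to_byte("^Z")))
    # Hypothetical sample line with NUL after two fields and SUB at the end.
    sample = b"field_one\x00;field_two\x00;rest\n\x1a"
    print(strip_bytes(sample))
```

To clean a real file, one could read it with open("old.log", "rb"), pass the contents through strip_bytes, and write the result to new.log in binary mode.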

1 Solution

giovere
Path Finder

09-27-2011 20:39:05.104 +0200 ERROR TailingProcessor - File will not be read, is too small to match seekptr checksum (file=/logrepo/not_indexed.txt). Last time we saw this initcrc, filename was different. You may wish to use a CRC salt on this source. Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.
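
The message suggests that Splunk recognizes files by a checksum over the start of the file, so two files that begin with the same bytes can be taken for the same file. The Python sketch below only illustrates that idea and is not Splunk's actual algorithm: zlib.crc32 is a stand-in, the 256-byte window is assumed to match the default initCrcLength, and head_crc is a hypothetical helper. Prepending the source path is meant to mimic what setting crcSalt = <SOURCE> in the inputs.conf stanza is documented to do, which is to mix the full source path into the checksum.

```python
import zlib

# Assumption: Splunk checksums the first 256 bytes by default (initCrcLength).
HEAD_LENGTH = 256


def head_crc(data: bytes, salt: bytes = b"", length: int = HEAD_LENGTH) -> int:
    """CRC32 over an optional salt followed by the first `length` bytes."""
    return zlib.crc32(salt + data[:length])


if __name__ == "__main__":
    # Two hypothetical files sharing a long header but differing afterwards.
    header = b"h" * 300
    file_a = header + b"first file body\n"
    file_b = header + b"second file body\n"

    # Without a salt both heads produce the same checksum.
    print(head_crc(file_a) == head_crc(file_b))

    # Salting with each file's own path makes the checksums differ.
    print(head_crc(file_a, b"/logrepo/a.txt") == head_crc(file_b, b"/logrepo/b.txt"))
```

If the files that were skipped begin with the same header as files that were already consumed, a source-based salt is the usual way to make Splunk treat them as distinct.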

0 Karma

bbingham
Builder

What is the error in the splunkd.log that says why it wasn't indexed?

giovere
Path Finder

Thanks for the replies. I've tried hexdump -C but could not spot anything suspicious.
The inputs.conf is really simple: just the move policy set to sinkhole, nothing else. I'm starting to suspect a maximum field length restriction. The sample files contain only the header, but that is sufficient to reproduce the problem: "not_indexed.txt" does not get indexed, but "indexed.txt" does. The only difference is that indexed.txt has one less underscore character in the first field. http://dl.dropbox.com/u/8430959/indexed.txt http://dl.dropbox.com/u/8430959/not_indexed.txt

0 Karma

Simeon
Splunk Employee

It sounds like you have a separate issue: the logs are not being indexed. Supplying your inputs.conf settings as well as a log sample would be helpful in debugging your problem.

0 Karma

Ayn
Legend

Not an answer to why Splunk isn't indexing your log, but you can check what ASCII value the character has by running cat old.log | hexdump -C.
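
If scanning the hexdump output by eye is tedious, the same check can be scripted. This is a rough Python sketch with a hypothetical helper name (control_bytes); it assumes tab, line feed, and carriage return are expected in a log file and reports every other ASCII control byte with its count.

```python
from collections import Counter

# Control bytes normally expected in a text log: TAB, LF, CR.
ALLOWED = {0x09, 0x0A, 0x0D}


def control_bytes(data: bytes) -> Counter:
    """Count ASCII control bytes (0x00-0x1F and 0x7F) that are not in ALLOWED."""
    return Counter(
        b for b in data if (b < 0x20 or b == 0x7F) and b not in ALLOWED
    )


if __name__ == "__main__":
    # Hypothetical sample; for a real file use open("old.log", "rb").read().
    sample = b"ab\x00\tc\x00\x1a\n"
    for value, count in sorted(control_bytes(sample).items()):
        print(f"0x{value:02x} occurs {count} time(s)")
```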

0 Karma