
Why does Splunk think my file is binary

Champion

Hi,

I'm trying to process a ".log" file on a Windows server, and Splunk keeps ignoring it, stating that it's a binary file.

02-26-2016 09:26:54.574 -0500 WARN  FileClassifierManager - The file 'C:\Temp\w32tmdebug.log' is invalid. Reason: binary
02-26-2016 09:26:54.574 -0500 INFO  TailReader - Ignoring file 'C:\Temp\w32tmdebug.log' due to: binary

I am able to open the file using Notepad, so I'm not sure why Splunk thinks it's binary. I also tried adding "NO_BINARY_CHECK", but that didn't work either. My inputs.conf stanza is below. Any suggestions?

[monitor://C:\Temp\w32tmdebug.log]
disabled = false
followTail = 0
index = main
sourcetype = ntpdebug_log
ignoreOlderThan = 2d
NO_BINARY_CHECK = true

Re: Why does Splunk think my file is binary

Path Finder

The missing backslash in your filepath could cause problems.


Re: Why does Splunk think my file is binary

SplunkTrust

What kind of data is in your log file? I know PDF files and the like are treated as binary.


Re: Why does Splunk think my file is binary

Champion

It's text. I can open and edit the file with Notepad, and Windows says it's a text file when you look at it in the folder view.


Re: Why does Splunk think my file is binary

SplunkTrust

Hi a212830, NO_BINARY_CHECK is a props.conf setting, not an inputs.conf one, so you will want to create a stanza in props.conf like:

[source::C:\Temp\w32tmdebug.log]
NO_BINARY_CHECK = true

However, I'd take a closer look at this log file. The encoding is probably screwy and therefore is throwing off Splunk. Windows logs can be tricky like that.

Please let me know if this helps!


Re: Why does Splunk think my file is binary

SplunkTrust

Is your file using something other than the UTF-8 or ASCII character sets? For instance, I had a similar problem with some logs encoded in UTF-16 and had to specify it explicitly.

See this:

http://docs.splunk.com/Documentation/Splunk/6.3.3/data/Configurecharactersetencoding#Comprehensive_l...
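
To illustrate why a UTF-16 file can look like binary data even though Notepad opens it fine, here is a minimal Python sketch. It does not reproduce Splunk's actual binary check (I don't know the exact heuristic it uses); it only shows that plain ASCII text encoded as UTF-16 has a NUL byte for every character. The helper `nul_ratio` and the sample line are made up for the example.

```python
def nul_ratio(data: bytes) -> float:
    """Return the fraction of bytes in `data` that are NUL (0x00)."""
    if not data:
        return 0.0
    return data.count(0) / len(data)


# Hypothetical log content, not taken from the real w32tm debug log.
line = "sample log line\r\n"

utf8_bytes = line.encode("utf-8")       # one byte per ASCII character
utf16_bytes = line.encode("utf-16-le")  # two bytes per ASCII character

print("UTF-8  NUL ratio:", nul_ratio(utf8_bytes))
print("UTF-16 NUL ratio:", nul_ratio(utf16_bytes))
```

A file full of NUL bytes is a common signal that tools treat as "binary", which is why telling Splunk the character set explicitly can help in cases like the one I ran into.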


Re: Why does Splunk think my file is binary

Ultra Champion

Interestingly, the out-of-the-box props.conf contains a number of NO_BINARY_CHECK = 1 settings, such as:

[WinNetMonMk]
KV_MODE = multi_WinNetMonMk
NO_BINARY_CHECK = 1
pulldown_type = 0


Re: Why does Splunk think my file is binary

Ultra Champion

I love all the answers and ideas posted here. I think I've come across this in the past and the root cause was the same as other folks on this thread have posted.

Here are some details on what I remember doing to determine whether encoding was the cause:

  1. Create a copy of the file (so you can muck around with it without impacting the production version).
  2. Create a new monitor stanza, same as the old one, for the new copy of the file. Validate that it still shows as "binary" when Splunk goes for it. This is a base case to make sure we're able to reproduce the problem.
  3. Open the copy in Notepad++. There's an Encoding menu item. I forget whether you have to select text first, so feel free to select all and then check the Encoding menu to see what is currently selected.
  4. Try switching to the UTF-8 or ANSI options, saving the copy, and restarting Splunk to see if the file gets indexed (i.e. is no longer recognized as binary).
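
If Notepad++ isn't handy, steps 3 and 4 can be roughly approximated with a short Python script. This is only a sketch under a big assumption: that the file starts with a byte order mark (BOM). If it has none, the encoding has to be determined some other way. The function names `guess_encoding`, `to_utf8`, and `reencode_file` are my own, not part of any Splunk tooling.

```python
import codecs
from typing import Optional

# Order matters: the UTF-32 LE BOM begins with the same bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes) -> Optional[str]:
    """Return a Python codec name based on the BOM, or None if there is no BOM."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None


def to_utf8(raw: bytes) -> bytes:
    """Decode BOM-marked bytes and return them re-encoded as UTF-8 (no BOM)."""
    encoding = guess_encoding(raw)
    if encoding is None:
        raise ValueError("no BOM found; determine the encoding another way")
    return raw.decode(encoding).encode("utf-8")


def reencode_file(src_path: str, dst_path: str) -> None:
    """Read the copy at src_path and write a UTF-8 version to dst_path."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_utf8(raw))
```

Run `reencode_file` against the copy from step 1 (never the production file), point the test monitor stanza at the output, and see whether Splunk still reports it as binary.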

I hope it turns out to be as simple as this. Crossing my fingers.


Re: Why does Splunk think my file is binary

Ultra Champion

Here's another approach for determining the file character set: http://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Garbledevents

So in this case, transfer the file (e.g. via FTP) from Windows to a Unix system and run the file command on it to determine the character set.
