I'm trying to index SharePoint 2010 logs in Splunk, but the events are unreadable in the UI:
0\x004\x00/\x001\x007\x00/\x002\x000\x001...
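That pattern is what UTF-16 little-endian text looks like when every byte is treated as its own character: each ASCII character is followed by a zero byte. A minimal Python sketch reproduces it, assuming only the first five characters visible in the sample (the rest of the line is truncated, so it is not guessed at here):

```python
# Only the first characters visible in the garbled sample are used;
# the remainder of the real log line is unknown.
sample = "04/17"

# Encode as UTF-16 little-endian: each ASCII character becomes two bytes,
# the character itself followed by a zero byte.
raw = sample.encode("utf-16-le")

# Read the bytes back one byte per character (Latin-1 maps bytes 1:1),
# roughly what a reader assuming a single-byte charset would see.
misread = raw.decode("latin-1")

# Make the zero bytes visible as literal \x00 text.
escaped = misread.replace("\x00", "\\x00")
print(escaped)

# Decoding with the correct charset recovers the original text.
decoded = raw.decode("utf-16-le")
print(decoded)
```

This is only an illustration of the symptom, not of how Splunk reads the file internally.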
Running the file command on Linux reports the file as UTF-16 little-endian. Setting that charset on the sourcetype doesn't have any effect. In fact, it seems to conflict on the server, as I get messages that the monitor detects UTF-8. A very old wiki page mentions:
splunk cmd classify
But the classifier gets it wrong: it reports the file as UTF-8 and flags it as binary.
Output of classify:
WARN FileClassifierManager - The file 'FSHPTP02-20130408-1404.log' is invalid. Reason: binary
PROPERTIES OF FSHPTP02-20130408-1404.log
PropertiesMap: {
CHARSET -> UTF-8
invalid_cause -> binary
is_valid -> False
sourcetype -> unknown
}
But the Linux file command says otherwise:
[mlanghor@mlanghor-wkstn U]$ file FSHPTP02-20130408-1404.log
FSHPTP02-20130408-1404.log: Little-endian UTF-16 Unicode English text, with very long lines, with CRLF line terminators
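To double-check the charset independently of both tools, the first bytes of the file can be inspected for a byte order mark (BOM). The sketch below is a rough stand-in, not what `file` or Splunk actually do; the function names `guess_charset` and `guess_file_charset` are hypothetical, and the no-BOM fallback only works for mostly-ASCII text:

```python
import codecs


def guess_charset(head: bytes) -> str:
    """Guess a charset from the first bytes of a file (rough heuristic)."""
    # Check UTF-32 before UTF-16: the UTF-32LE BOM begins with the UTF-16LE BOM.
    boms = [
        (codecs.BOM_UTF8, "UTF-8"),
        (codecs.BOM_UTF32_LE, "UTF-32LE"),
        (codecs.BOM_UTF32_BE, "UTF-32BE"),
        (codecs.BOM_UTF16_LE, "UTF-16LE"),
        (codecs.BOM_UTF16_BE, "UTF-16BE"),
    ]
    for bom, name in boms:
        if head.startswith(bom):
            return name
    # No BOM: ASCII-only UTF-16LE text has a zero byte in every odd position.
    odd = head[1::2]
    if odd and odd.count(0) == len(odd):
        return "UTF-16LE"
    return "unknown"


def guess_file_charset(path: str) -> str:
    """Read a small prefix of the file in binary mode and guess its charset."""
    with open(path, "rb") as f:
        return guess_charset(f.read(64))
```

Calling `guess_file_charset("FSHPTP02-20130408-1404.log")` on the file from this question would be one way to confirm which charset to set on the sourcetype.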
See http://wiki.splunk.com/Community:WindowsCharacterEncoding, which provides a solution for:
- Logs coming in as hex
- Logs not monitored with the messages: "TailReader - Ignoring file '' due to: binary" and "FileClassifierManager - The file '' is invalid. Reason: binary"
Did you try changing the encoding? See: http://docs.splunk.com/Documentation/Splunk/latest/Data/Configurecharactersetencoding
Was there any resolution on this? I have the same issue.