I have seen several questions regarding null (\x00) bytes in data, but none have helped me resolve my issue so far.
I am trying to read a log file from Sophos using Universal Forwarders. I have done the following so far:
Added a new sourcetype in Splunk Web.
props.conf on the indexer:
[my_sourcetype]
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y%m%d %H%M%S
TZ = UTC
pulldown_type = 1
CHARSET = UTF-16LE
Modified inputs.conf on the forwarders:
[monitor://C:\ProgramData\Sophos\Sophos Device Control\logs]
sourcetype=my_sourcetype
Sample data from C:\ProgramData\Sophos\Sophos Device Control\logs\DeviceControl.txt:
20131001 150737 Device control has started on this machine.
20131003 131815 Device control has started on this machine.
When I search sourcetype="my_sourcetype", I see data, but it looks like this:
\x002\x000\x001\x003\x00 \x001\x005\x000\x007\x003\x007....
If I copy that data into Notepad and replace every \x00 with nothing, I see the data that I expect.
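For reference, the pattern above is what UTF-16 text looks like when its bytes are displayed one at a time without decoding. The following is a minimal Python sketch, not anything Splunk runs: `show` is a hypothetical helper that imitates the escaped display in the search results, and the sample line is taken from the log excerpt above. It compares the little-endian and big-endian byte order and confirms that stripping the null bytes (the Notepad step) gives the same text as decoding with the charset named in props.conf.

```python
# Sample line copied from the DeviceControl.txt excerpt above.
line = "20131001 150737 Device control has started on this machine."


def show(raw: bytes) -> str:
    """Hypothetical helper: printable ASCII as-is, every other byte as \\xNN."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else f"\\x{b:02x}" for b in raw)


le = line.encode("utf-16-le")  # low byte first: character, then null
be = line.encode("utf-16-be")  # high byte first: null, then character

print(show(le[:12]))  # 2\x000\x001\x003\x001\x000\x00
print(show(be[:12]))  # \x002\x000\x001\x003\x001\x000

# Stripping the nulls (what the Notepad replace does) and decoding with
# the matching charset both recover the original line for ASCII-only text.
print(le.replace(b"\x00", b"") == line.encode("ascii"))  # True
print(le.decode("utf-16-le") == line)                    # True
```

Note that in pure UTF-16LE each character comes before its null byte, so a display that starts with \x00 suggests either big-endian data or text whose byte alignment has shifted by one.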
Before I left tonight, I noticed that the text file I am reading from is shown in blue in Windows Explorer, which indicates that the NTFS compression attribute is set. Every file in this folder is set this way, and removing compression is not an option.
What do I need to do in order to have Splunk index the data without null values? All other data coming from TA-Windows and other apps is fine and does not show null values.
Update 10/17/13:
I wanted to clarify that this is Splunk 4.3.3 on Windows Server 2008 R2 SP1, with Windows 7 SP1 x64 hosts running the Universal Forwarder. Upgrading Splunk is not an option at this time, but we are pushing to do so in the near future.
/etc/system/local/outputs.conf on the forwarder:
[tcpout]
defaultGroup = 1.2.3.4_9997
[tcpout:1.2.3.4_9997]
server = 1.2.3.4:9997
[tcpout-server://1.2.3.4:9997]
/etc/system/local/inputs.conf on the indexer:
[default]
host = my_hostname
[script://$SPLUNK_HOME\bin\scripts\splunk-admon.path]
disabled = 0
[script://$SPLUNK_HOME\bin\scripts\splunk-perfmon.path]
disabled = 0
.... (two more script stanzas)
[monitor://C:\ProgramData\Sophos\Sophos Device Control\logs]
sourcetype=my_sourcetype
Again, all other data coming from the forwarders looks fine without null bytes. Only the data from Sophos is an issue. I am also noticing entries in Splunk with just a single null character as the data (\x00).
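One possible explanation for both symptoms (the leading \x00 and the events that contain only \x00) is that the file is UTF-16LE but is being split into lines byte by byte before any decoding happens. This is an assumption, not something confirmed for this setup. The Python sketch below only illustrates the effect; it assumes the log uses \r\n line endings, and `show` is a hypothetical helper that imitates the escaped display in the search results.

```python
# Two sample lines from the log excerpt; the \r\n endings are an assumption.
sample = (
    "20131001 150737 Device control has started on this machine.\r\n"
    "20131003 131815 Device control has started on this machine.\r\n"
)
raw = sample.encode("utf-16-le")


def show(data: bytes) -> str:
    """Hypothetical helper: printable ASCII as-is, every other byte as \\xNN."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else f"\\x{b:02x}" for b in data)


# Naive byte-wise line breaking: split on the single byte 0x0A without
# decoding. In UTF-16LE a newline is the byte pair 0x0A 0x00, so the
# trailing 0x00 ends up at the start of the following chunk.
chunks = raw.split(b"\n")

for chunk in chunks:
    print(show(chunk[:10]))
# 2\x000\x001\x003\x001\x00
# \x002\x000\x001\x003\x001
# \x00
```

Under these assumptions, every line after the first starts with a stray null byte, and the final newline leaves a chunk consisting of a single \x00, which matches the kind of entries described above.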