We recently ran into this same problem with rsyslog and our Blue Coat proxies. Here is what we found, in case it helps anyone who finds this thread down the road.
EDIT:
We ended up getting to the bottom of this issue. The problem is rsyslog's default behavior when parsing syslog messages received over TCP. Specifically, "Octet Counted Framing" support is enabled by default, which causes typical BlueCoat access logs to be parsed incorrectly. For more info on the octet-counted framing syntax, see RFC 6587: https://tools.ietf.org/html/rfc6587#section-3.4.1
TL;DR
In order to get rsyslog to even begin to properly handle BlueCoat access logs, you must disable the "octet counted framing" support on your TCP listener(s):
## rsyslog.conf
# if using the "imtcp" module
$InputTCPSupportOctetCountedFraming off
# if using the "imptcp" module (Plain TCP)
$InputPTCPSupportOctetCountedFraming off
# -- OR -- (new-style RainerScript syntax)
# if using the "imtcp" module
input(type="imtcp" port="<TCP_PORT>" ruleset="<RuleSet_Name>" supportOctetCountedFraming="off")
# if using the "imptcp" module (Plain TCP)
input(type="imptcp" port="<TCP_PORT>" ruleset="<RuleSet_Name>" supportOctetCountedFraming="off")
NOTE: The following answer has been edited to include "octet counted framing" information.
Here are some facts:
Blue Coat simply writes the access log file straight to the TCP socket
Blue Coat config allows "periodic" and "continuous" sending modes. For a syslog destination, be sure to use continuous. (See Symantec support article: https://support.symantec.com/en_US/article.TECH242216.html)
Blue Coat log records are delimited by CARRIAGE RETURN LINE FEED ( \r\n )
SIDENOTE: the carriage return, combined with the rsyslog "$EscapeControlCharactersOnReceive on" option, causes every Blue Coat message to end with #015. 015 is octal for 13, and ASCII 13 is the carriage return.
rsyslog (default config) uses LINE FEED ( \n ) character as message delimiter
rsyslog (default config) tries to parse every line (important to understand later)
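To illustrate the #015 sidenote above, here is a rough Python sketch of that kind of escaping. It is only a model of the behavior we observed, not rsyslog's code: the function name `escape_control_chars` and the "anything below ASCII 32 gets escaped" rule are our own assumptions, and the sample record is shortened.

```python
def escape_control_chars(msg: str, prefix: str = "#") -> str:
    """Replace each control character (assumed here: code point < 32)
    with prefix + its 3-digit octal value, e.g. carriage return -> #015."""
    return "".join(
        f"{prefix}{ord(ch):03o}" if ord(ch) < 32 else ch
        for ch in msg
    )


# A Blue Coat record ends with \r\n. Splitting on \n leaves the \r at the
# end of the message, and that trailing \r is what gets escaped.
record = "2018-12-04 12:38:37 96 192.168.1.122 200 TCP_NC_MISS\r\n"
message = record.split("\n")[0]
print(escape_control_chars(message))
```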
When the Blue Coat device sends access logs, the messages look something like:
2018-12-04 12:38:37 96 192.168.1.122 200 TCP_NC_MISS 999 572 GET https www.example.com 443 <... truncated ...>
2018-12-04 12:38:37 19 192.168.1.153 200 TCP_ACCELERATED 39 213 CONNECT tcp www.example.co <... truncated ...>
When you send messages like this to rsyslog, rsyslogd will emit error messages like:
rsyslogd: Framing Error in received TCP message: delimiter is not SP but has ASCII value 45. [v8.24.0-34.el7]
This happens because the message begins with a digit, so rsyslog assumes the client is using "octet-counted framing". The error is emitted because octet-counted framing syntax requires the "message size" header to be terminated by a SPACE ( ) character, but rsyslog found a dash ( - , ASCII value 45) instead:
2018-12
    ^ here
Unfortunately, rsyslog continues with the assumption that this is an octet-counted message frame: it reads the next 2018 bytes of data and uses that as the message. The next message is assumed to start right after those 2018 bytes, and the parser begins again at that offset in the message stream. Depending on the actual log data, this can cause what appears to be wildly erratic parsing behavior. For example, the log file that rsyslog writes might look something like this:
#Remark: 1234567890 "bluecoat01.mydomain.co" "192.168.1.5" "Main_SYSL"#015
12-04 18:45:08 43 192.168.1.205 304 TCP_HIT 707 1313 GET https www.example.com 443 <... truncated ...>
"Mozi
lla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome <... truncated ...>
This leaves the Blue Coat logs so garbled that they are useless to Splunk.
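If it helps to see the mechanism, here is a small Python sketch that mimics the framing logic as we described it above. This is a simplified model written for illustration, not rsyslog's actual parser: `split_frames` is a hypothetical name, the sample record is a shortened version of the log line shown earlier, and details like maximum message size are ignored.

```python
from typing import List


def split_frames(stream: bytes, octet_counting: bool = True) -> List[bytes]:
    """Split a TCP byte stream into messages (simplified model)."""
    frames = []
    pos = 0
    while pos < len(stream):
        if octet_counting and stream[pos:pos + 1].isdigit():
            # Leading digits are taken as the octet count of the frame.
            end = pos
            while end < len(stream) and stream[end:end + 1].isdigit():
                end += 1
            count = int(stream[pos:end].decode("ascii"))
            # The byte after the digits should be a space. As described
            # above, a wrong delimiter is reported but still consumed.
            start = end + 1
            frames.append(stream[start:start + count])
            pos = start + count
        else:
            # Plain LF-delimited message (the trailing \r stays in place).
            nl = stream.find(b"\n", pos)
            if nl == -1:
                nl = len(stream)
            frames.append(stream[pos:nl])
            pos = nl + 1
    return frames


# Shortened sample record, repeated to get a stream longer than 2018 bytes.
record = (b"2018-12-04 12:38:37 96 192.168.1.122 200 TCP_NC_MISS "
          b"999 572 GET https www.example.com 443\r\n")
stream = record * 40

good = split_frames(stream, octet_counting=False)
bad = split_frames(stream, octet_counting=True)
print(len(good), "messages with octet counting off")
print(len(bad), "messages with octet counting on")
print(bad[0][:30])
```

With octet counting off, every record comes out as its own message. With it on, the first "message" swallows 2018 bytes worth of records and starts at "12-04", just like the garbled output above.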
The solution to this problem is to disable Octet Counted Framing support on the TCP listener(s) that need to accept BlueCoat access logs. This can be done by adding one or more of the following config options to your rsyslog.conf file:
## rsyslog.conf
# if using the "imtcp" module
$InputTCPSupportOctetCountedFraming off
# if using the "imptcp" module (Plain TCP)
$InputPTCPSupportOctetCountedFraming off
# -- OR -- (new-style RainerScript syntax)
# if using the "imtcp" module
input(type="imtcp" port="<TCP_PORT>" ruleset="<RuleSet_Name>" supportOctetCountedFraming="off")
# if using the "imptcp" module (Plain TCP)
input(type="imptcp" port="<TCP_PORT>" ruleset="<RuleSet_Name>" supportOctetCountedFraming="off")
Final note: if you actually have syslog clients that properly use the "octet-counted framing" syntax, you will need to deploy at least two TCP listeners: one with octet-counted framing support and one without. Otherwise the logs from your "octet-counted framing devices" will get garbled. This could be achieved with a configuration as follows:
## rsyslog-example.conf
# syslog clients that won't break rsyslog's "octet-counted framing"
# implementation may send messages to the default syslog port (514).
input(type="imtcp" port="514" ruleset="MyTCPMessageRuleSet")
# BlueCoat access logs and other clients that might break rsyslog's
# "octet-counted framing" implementation must send messages to a custom TCP
# port (5141).
input(type="imtcp" port="5141" ruleset="MyTCPMessageRuleSet" supportOctetCountedFraming="off")
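For anyone who wants to test the two listeners, here is a minimal Python sketch of what each kind of client puts on the wire. The octet-counted form follows the "MSG-LEN SP SYSLOG-MSG" layout from RFC 6587 section 3.4.1; the other form just terminates each message with a line feed. The function names, the `send` helper, and the host name in the usage comment are placeholders we made up for illustration.

```python
import socket


def frame_octet_counted(msg: str) -> bytes:
    """Octet-counted framing: byte length, one space, then the message."""
    payload = msg.encode("utf-8")
    return str(len(payload)).encode("ascii") + b" " + payload


def frame_lf_terminated(msg: str) -> bytes:
    """Plain framing: the message followed by a single line feed."""
    return msg.encode("utf-8") + b"\n"


def send(host: str, port: int, data: bytes) -> None:
    """Open a TCP connection, send the bytes, and close it again."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(data)


# Example usage (hypothetical host, ports taken from the config above):
#   msg = "<34>Oct 11 22:14:15 myhost myapp: test message"
#   send("rsyslog.example.com", 514, frame_octet_counted(msg))
#   send("rsyslog.example.com", 5141, frame_lf_terminated(msg))
```

The length in the octet-counted form is the number of bytes, not characters, which is why the message is encoded before it is measured.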