When using "Non-Audit" as the data input, I am able to retrieve a few lines, and then the input fails. Other input settings work as intended.
While I was monitoring the output of /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log I noticed the following:
2016-06-27 16:46:37,129 +0000 log_level=INFO, pid=28652, tid=Thread-9, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="fwmgmtp02-nonAudit" connection="fwmgmtp02" data="non_audit"]log_level=2 file:lea_loggrabber.cpp func_name:read_fw1_logfile_collogs code_line_no:2052 :LEA collected logfile handler was invoked
2016-06-27 16:47:02,901 +0000 log_level=ERROR, pid=28652, tid=Thread-1, file=event_writer.py, func_name=_do_write_events, code_line_no=79 | EventWriter encounter exception which maycause data loss, queue leftsize=838
Traceback (most recent call last):
File "/opt/splunk/etc/apps/Splunk_TA_checkpoint-opseclea/bin/splunk_ta_checkpoint_opseclea/splunktalib/event_writer.py", line 62, in _do_write_events
for evt in event:
File "/opt/splunk/etc/apps/Splunk_TA_checkpoint-opseclea/bin/splunk_ta_checkpoint_opseclea/splunktaucclib/data_collection/ta_data_collector.py", line 59, in <genexpr>
index, scu.escape_cdata(event.event)) for event
File "/opt/splunk/etc/apps/Splunk_TA_checkpoint-opseclea/bin/splunk_ta_checkpoint_opseclea/splunktalib/common/util.py", line 71, in escape_cdata
data = data.encode("utf-8", errors="xmlcharrefreplace")
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 572: invalid start byte
2016-06-27 16:47:02,914 +0000 log_level=INFO, pid=28652, tid=Thread-1, file=event_writer.py, func_name=_do_write_events, code_line_no=84 | Event writer stopped, queue leftsize=849
2016-06-27 16:47:02,915 +0000 log_level=INFO, pid=28652, tid=Thread-4, file=ta_data_collector.py, func_name=_write_events, code_line_no=122 | [input_name="fwmgmtp02-nonAudit" data="non_audit"] the event queue is closed and the received data will be discarded
2016-06-27 16:47:02,915 +0000 log_level=INFO, pid=28652, tid=Thread-4, file=ta_data_collector.py, func_name=index_data, code_line_no=114 | [input_name="fwmgmtp02-nonAudit" data="non_audit"] End of indexing data for fwmgmtp02-nonAudit_non_audit
It seems that only the Non-Audit setting is retrieving unexpected Unicode values
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 572: invalid start byte
at which point the process hangs and indexing stops. The opsec_lea connection stays up, however, event processing is halted.
The byte value (0xa0) in this case isn't a specific indicator, I've had the same result for 0xf2 and 0xfc. I noticed that all of these are valid ASCII characters, but invalid standard Unicode characters (for example, 0xfc is in the "Specials" block as an o with an umlaut).
I believe this only occurs with Non-Audit as this is the only input setting that also retrieves Anti-Malware and Anti-Virus events (which I need to index). Although it is most likely a Unicode-handling error, I also found a related issue with fw-loggrabber here:
http://manpages.ubuntu.com/manpages/trusty/man1/fw1_lea2dlf.1.html
In the "Other notes" it points out an "unexpected non-continuation byte" -- the Splunk output is almost exactly the same: "invalid continuation byte".
I tested setting the environment variables in /opt/splunk/etc/splunk-launch.conf however had the same issue.
Does anybody else have an idea to enable Non-Audit and keep it stable?
... View more