Hello All
I am working with our CheckPoint FW admin to figure out why their tool shows 17 million events for the past 8 hrs, and Splunk is only showing roughly 5500 events. I have looked at the errors and this is the only error I could find.
7/26/16
11:38:46.179 AM
2016-07-26 18:38:46,179 +0000 log_level=ERROR, pid=31312, tid=Thread-1, file=event_writer.py, func_name=_do_write_events, code_line_no=79 | EventWriter encounter exception which maycause data loss, queue leftsize=2
Traceback (most recent call last):
File "/opt/splunk/etc/apps/Splunk_TA_checkpoint-opseclea/bin/splunk_ta_checkpoint_opseclea/splunktalib/event_writer.py", line 63, in _do_write_events
write(evt)
IOError: [Errno 32] Broken pipe
I also have all 5 sourcetypes being logged: Firewall Events, Firewall Audit, Firewall Non-Audit, Firewall VPN, and Firewall SmartDefense. Again, a search for errors in the Check Point TA shows only this one error. We are using the latest version of the Splunk Add-on for Check Point OPSEC LEA.
-ed
I think you're running into the same problem I did before. See this answer:
https://answers.splunk.com/answers/421857/splunk-add-on-for-check-point-opsec-lea-non-audit.html
Although I haven't had any issues with treating the input as latin-1, an alternative is to edit line 71 of /opt/splunk/etc/apps/Splunk_TA_checkpoint-opseclea/bin/splunk_ta_checkpoint_opseclea/splunktalib/common/util.py, changing:
data = data.encode("utf-8", errors="xmlcharrefreplace")
to:
data = data.decode("utf-8").encode("utf-8", errors="xmlcharrefreplace")
Credit for the alternative fix goes to @xchen_splunk
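For anyone wondering why the extra decode matters: in Python 2 (which I'm assuming the TA was running under at the time), calling .encode() on a byte string triggers an implicit ASCII decode first, and that fails on any non-ASCII byte. Below is a minimal Python 3 sketch of the same idea. The helper name to_utf8_bytes is hypothetical and is not part of the TA, and the latin-1 fallback is my own combination of the two fixes mentioned above, not something taken from util.py.

```python
def to_utf8_bytes(data):
    """Return UTF-8 bytes for `data`, which may be bytes or text.

    Hypothetical helper for illustration only; not the TA's code.
    """
    if isinstance(data, bytes):
        try:
            # Decode explicitly instead of relying on an implicit decode.
            text = data.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: fall back to latin-1, which maps every byte value
            # to a code point and therefore never raises.
            text = data.decode("latin-1")
    else:
        text = data
    # Same encode call as in the edited line of util.py.
    return text.encode("utf-8", errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(to_utf8_bytes(b"caf\xc3\xa9"))  # already valid UTF-8 bytes
    print(to_utf8_bytes(b"caf\xe9"))      # latin-1 bytes, re-encoded as UTF-8
```

Either way, the point is that the decode happens explicitly with a known codec, so a stray non-ASCII byte in a log record should not raise inside the writer.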
You might also want to check this answer:
https://answers.splunk.com/answers/417709/opsec-lea-app-4-state-of-connection.html
When I ran into the same problem, those were the steps I needed to take to get the data collection working again.
I read both articles and made the changes you described, but I still have no more logs than I did before. Again, the CPFW admin says the CP GUI shows 17 million events in the last 8 hrs, while Splunk, with all 5 inputs enabled, shows only 5800 events for the past 24 hrs.
7/27/16
10:49:13.604 AM
2016-07-27 17:49:13,604 +0000 log_level=DEBUG, pid=86465, tid=Thread-131, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_sd" connection="wvdpclogsvr" data="smartdefense"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104311 parent_pid=86465, Sleeping 1 sec
host = splk-idx-05.wv.mentorg.com source = /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log sourcetype = opseclea:log:modinput
7/27/16
10:49:13.599 AM
2016-07-27 17:49:13,599 +0000 log_level=DEBUG, pid=86465, tid=Thread-129, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_na" connection="wvdpclogsvr" data="non_audit"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104307 parent_pid=86465, Sleeping 1 sec
host = splk-idx-05.wv.mentorg.com source = /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log sourcetype = opseclea:log:modinput
7/27/16
10:49:13.597 AM
2016-07-27 17:49:13,597 +0000 log_level=DEBUG, pid=86465, tid=Thread-127, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_vpn" connection="wvdpclogsvr" data="vpn"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104303 parent_pid=86465, Sleeping 1 sec
host = splk-idx-05.wv.mentorg.com source = /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log sourcetype = opseclea:log:modinput
7/27/16
10:49:13.595 AM
2016-07-27 17:49:13,595 +0000 log_level=DEBUG, pid=86465, tid=Thread-125, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_fwe" connection="wvdpclogsvr" data="fw"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104299 parent_pid=86465, Sleeping 1 sec
host = splk-idx-05.wv.mentorg.com source = /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log sourcetype = opseclea:log:modinput
7/27/16
10:49:13.593 AM
2016-07-27 17:49:13,593 +0000 log_level=DEBUG, pid=86465, tid=Thread-123, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_fwa" connection="wvdpclogsvr" data="audit"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104296 parent_pid=86465, Sleeping 1 sec
host = splk-idx-05.wv.mentorg.com source = /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log sourcetype = opseclea:log:modinput
7/27/16
10:49:12.604 AM
2016-07-27 17:49:12,604 +0000 log_level=DEBUG, pid=86465, tid=Thread-131, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_sd" connection="wvdpclogsvr" data="smartdefense"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104311 parent_pid=86465, Sleeping 1 sec
host = splk-idx-05.wv.mentorg.com source = /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log sourcetype = opseclea:log:modinput
7/27/16
10:49:12.599 AM
2016-07-27 17:49:12,599 +0000 log_level=DEBUG, pid=86465, tid=Thread-129, file=ta_opseclea_data_collector.py, func_name=get_logs, code_line_no=62 | [input_name="internal_na" connection="wvdpclogsvr" data="non_audit"]log_level=3 file:lea_loggrabber.cpp func_name:main code_line_no:1107 :Current pid=104307 parent_pid=86465, Sleeping 1 sec
And this is all that the logs are showing me.
If you're trying to pull all the relevant Check Point data (e.g., Audit, Firewall, SmartDefense, VPN, Anti-Bot, etc.), then I would recommend you only enable two inputs in the Splunk Add-on: Audit and Non-Audit.
The reason is that Non-Audit will gather your SmartDefense, Firewall and VPN data (as well as Anti-Bot, Anti-Malware, etc). It's not really intuitive, but it is in the docs under item 6: http://docs.splunk.com/Documentation/AddOns/released/OPSEC-LEA/Setup2#Create_a_new_input
Non-Audit: Collects all event types except audit events
When I was testing the TA I did the same thing, and found that the collection process seemed to hang if I enabled too many inputs at the same time. It seemed like multiple fw-loggrabber processes (more than 2) overwhelm the log server and the connections stop responding. I had to run tcpdump to notice that the connection was open but no data was flowing.
Your alternative is to disable only the Non-Audit input, but then you won't get Anti-Malware, Anti-Bot, etc.
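If you want a quick way to see which inputs look idle without running tcpdump, here is a rough Python sketch that tallies the modinput log lines per input_name. It only relies on the line format visible in the log excerpts earlier in this thread. Treating an input that logs nothing but "Sleeping 1 sec" as stalled is my assumption, not documented TA behavior, and summarize is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the bracketed prefix seen in the modinput log lines in this thread.
INPUT_RE = re.compile(
    r'\[input_name="(?P<name>[^"]+)" connection="[^"]+" data="[^"]+"\]'
)

# Three lines copied from the log excerpts above.
PREFIX = ("log_level=DEBUG, pid=86465, tid=Thread-%d, "
          "file=ta_opseclea_data_collector.py, func_name=get_logs, "
          "code_line_no=62 | ")
SUFFIX = ("log_level=3 file:lea_loggrabber.cpp func_name:main "
          "code_line_no:1107 :Current pid=%d parent_pid=86465, Sleeping 1 sec")
SAMPLE = [
    "2016-07-27 17:49:13,604 +0000 " + PREFIX % 131
    + '[input_name="internal_sd" connection="wvdpclogsvr" data="smartdefense"]'
    + SUFFIX % 104311,
    "2016-07-27 17:49:13,599 +0000 " + PREFIX % 129
    + '[input_name="internal_na" connection="wvdpclogsvr" data="non_audit"]'
    + SUFFIX % 104307,
    "2016-07-27 17:49:12,604 +0000 " + PREFIX % 131
    + '[input_name="internal_sd" connection="wvdpclogsvr" data="smartdefense"]'
    + SUFFIX % 104311,
]


def summarize(lines):
    """Count 'Sleeping' vs. other messages per input_name."""
    sleeping, other = Counter(), Counter()
    for line in lines:
        match = INPUT_RE.search(line)
        if not match:
            continue  # not a per-input loggrabber line
        name = match.group("name")
        if line.rstrip().endswith("Sleeping 1 sec"):
            sleeping[name] += 1
        else:
            other[name] += 1
    return sleeping, other


if __name__ == "__main__":
    sleeping, other = summarize(SAMPLE)
    for name in sorted(set(sleeping) | set(other)):
        print(name, "sleeping:", sleeping[name], "other:", other[name])
```

To use it on the real file, pass the open log file (for example /opt/splunk/var/log/splunk/splunk_ta_checkpoint-opseclea_modinput.log) to summarize instead of SAMPLE. An input whose "other" count stays at zero over a long window is a candidate for the hang described above.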
OK, I did that as well. I guess I'll wait and see whether the data starts flowing in faster and in greater volume.
You still may have to bounce your reporting server and restart Splunk on the heavy forwarder. This TA is a bit touchy about the setup, but once it is working it seems to be stable.
I have restarted the heavy forwarder several times, but still nothing new or additional is coming in. I am not sure that I can ask the CPFW admin to restart the management/firewall log server, as we are in an end-of-quarter freeze.
We found that a restart of the CP log server was required in this case. You may be stuck until you can restart.