Intrusion events not being sent to Splunk

Kieffer87
Communicator

I'm running eNcore in our lab environment to replace the Splunk eStreamer Add-On. The connection events are coming through just fine, as is Access Control Policy Metadata; however, I'm not seeing any Intrusion Events being logged in Splunk.

I'm a bit confused about whether I need to manually edit the configuration files, as the documentation reads as though it should work out of the box. I have skimmed through the estreamer.conf file, and core as well as intrusion are both set to true. The old eStreamer Add-On is receiving Intrusion Events, so I know the FMC is capable of sending them. I'm running Splunk 6.5.3 and FirePower 6.2.2.

Example of the metadata being logged instead of an intrusion event:

rec_type=145 name=DMZ rec_type_desc="Access Control Policy Metadata" rec_type_simple="ACCESS CONTROL POLICY" sensor=zdxidsxxxxx uuid=00000000-0000-0000-0000-xxxxxxxxxxxx

I'm also seeing the following warning logged:

2017-11-13 10:26:06,927 estreamer.metadata.cache WARNING  Metadata key ('uuid') missing on object ({'recordLength': 8, 'checksum': 0, 'blockLength': 8, 'archiveTimestamp': 0, 'blockType': 15, 'recordType': 119}). Ignoring

mmcginnis2
Explorer

I am seeing a very similar issue. Our old environment using the old eStreamer app is working fine. When installing the eNcore app in a completely new environment (so we don't have to worry about the old TA messing w/ things), we are seeing a trickle of IPS events (1/20th of what FMC is seeing). After reading the comments above, I disabled the connection events, but after a few hours, I haven't seen any progress. I am not seeing my python process or system resources (core, mem, etc.) overtaxed.

Is there a manual configuration setting (outside of the .confs) that needs to be made?


douglashurd
Builder

From an initial read by the experts, you may be hitting a performance limitation. The current version generally maxes out at 1.33 CPU cores. We have plans for a more scalable version, but I don't have a date yet.

What sort of event rate does your deployment run at for Connection Events? Any ideas?

If it is a performance limit, it might explain the timestamp gap you see. The event queue gets pruned, and if eNcore falls behind, the older events could get pruned before transmission; when it resumes, you get the subsequent events that weren't pruned. We've seen this in some networks with very high rates.
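
To illustrate the idea, here is a minimal toy model of a pruned queue with a consumer that can't keep up. It is only a sketch: the function name, the capacity, and the per-tick rates are all made up for illustration, and it does not represent the FMC's or eNcore's actual queueing or pruning logic.

```python
from collections import deque


def simulate(produce_per_tick, consume_per_tick, capacity, ticks):
    """Toy model: return (delivered, pruned) event sequence numbers."""
    queue = deque()
    delivered = []
    pruned = []
    next_seq = 0
    for _ in range(ticks):
        # Producer (stand-in for the sender) appends new events.
        for _ in range(produce_per_tick):
            queue.append(next_seq)
            next_seq += 1
        # Oldest events are pruned once the queue exceeds its capacity.
        while len(queue) > capacity:
            pruned.append(queue.popleft())
        # Consumer (stand-in for the receiver) drains what it can this tick.
        for _ in range(min(consume_per_tick, len(queue))):
            delivered.append(queue.popleft())
    return delivered, pruned


# Consumer is three times slower than the producer: gaps appear.
delivered, pruned = simulate(produce_per_tick=3, consume_per_tick=1,
                             capacity=4, ticks=4)
print("delivered:", delivered)
print("pruned:   ", pruned)
```

With a slow consumer, the delivered sequence has holes where older events were pruned before they could be read, which is the kind of gap described above.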


Kieffer87
Communicator

I recall reading in the documentation and in the configuration files that eNcore still has to process all events sent to it by FirePOWER but it's only writing the Intrusion events to Splunk since I don't have connection events enabled in the configuration...Is that accurate? If that's true it now makes sense why it may be backlogged as it still has to process the connection events and drop them.

Connection Events: ~500 per second


douglashurd
Builder

On connection events, if you disable them at the FMC's eStreamer configuration page that should prevent them from being sent and free up CPU for eNcore as it won't have to read/write/format those events.

On the rate of the events, it should be possible to support 500 events per second with sufficient resources. The CPU clock speed will be a huge factor. What does this 16 CPU platform have for CPUs? Looks like you have plenty of Disk and RAM.


Kieffer87
Communicator

Disabling the connection events (and anything else aside from Intrusion Events) at the FMC seems to have corrected the Intrusion Event lag to Splunk. Surprisingly, there is still a single python process at 100% CPU utilization. I would be really surprised if hardware is the limiting factor: the server has 2x Intel E7-4830 (2.13GHz base and 2.4GHz boost). We have Splunk heavy forwarders ingesting Checkpoint Firewall OpsecLEA connections at 5k+ EPS on virtual machines with 4 cores and 8GB RAM.


douglashurd
Builder

We've seen a few customers hit some performance limitations where one CPU hits 100 percent and a second at 33%. We know we need to make some enhancements to make it more scalable.

Do you have a second CPU available?

If you shut off Connection Events, do the Intrusion events show up?


Kieffer87
Communicator

System resources should not be an issue. We are using our old ArcSight SIEM box, which has 16 physical cores, 32 virtual, 256GB RAM and 12TB flash storage. This system is only used as a test box for onboarding Splunk apps and data, so it sits relatively unused.

Here's an output from top with connection events off. The top python process always remains at 100%. The second two generally hover around 20% and the last one is generally under 2%. Velocity seems to jump from negative 1 to positive 2 even with minimal events coming in.

54708 splunk    20   0  133784  13784   3908 R 100.0  0.0  89182:56 python
17940 splunk    20   0  287516  12524   2668 S  21.9  0.0   1440:51 python
17939 splunk    20   0  287480  11268   1600 S  18.3  0.0   1045:41 python
17932 splunk    20   0  361280  14928   5032 S   1.7  0.0 141:10.29 python
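
For anyone wanting to total this up, here is a rough sketch that parses the four lines above and sums the CPU column. It assumes the default top column order (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND); the helper name is just for illustration.

```python
TOP_OUTPUT = """\
54708 splunk    20   0  133784  13784   3908 R 100.0  0.0  89182:56 python
17940 splunk    20   0  287516  12524   2668 S  21.9  0.0   1440:51 python
17939 splunk    20   0  287480  11268   1600 S  18.3  0.0   1045:41 python
17932 splunk    20   0  361280  14928   5032 S   1.7  0.0 141:10.29 python
"""


def parse_top_line(line):
    """Split one top process line, assuming the default column order."""
    fields = line.split()
    return {
        "pid": int(fields[0]),
        "state": fields[7],
        "cpu": float(fields[8]),
        "command": fields[11],
    }


procs = [parse_top_line(line) for line in TOP_OUTPUT.splitlines() if line.strip()]

# Total CPU across the python processes, in percent of a single core.
total_cpu = round(sum(p["cpu"] for p in procs), 1)
# Processes that are effectively saturating one core.
pegged = [p["pid"] for p in procs if p["cpu"] >= 99.0]

print("total %CPU:", total_cpu)
print("pegged PIDs:", pegged)
```

The total comes to roughly 1.4 cores' worth of CPU, with a single process saturating one core.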

Intrusion Events are coming in but appear to be quite delayed. In the last hour according to Splunk I have 40 Intrusion Events while FirePOWER FMC shows 64. The latest in Splunk has a timestamp of 11/21/17 07:39:17 while the latest in FirePOWER is 11/21/17 07:46:25. This accounts for 20 of the events. The rest of the missing events are mixed in throughout the past.
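
Spelling out the arithmetic with the numbers above (just a quick sketch; the variable names are arbitrary):

```python
from datetime import datetime

FMT = "%m/%d/%y %H:%M:%S"

# Latest intrusion event timestamps quoted above.
latest_splunk = datetime.strptime("11/21/17 07:39:17", FMT)
latest_fmc = datetime.strptime("11/21/17 07:46:25", FMT)
lag = latest_fmc - latest_splunk

# Event counts for the last hour quoted above.
fmc_count = 64
splunk_count = 40
missing = fmc_count - splunk_count

# Events newer than the latest one in Splunk (the "tail"), per the post.
tail = 20
scattered = missing - tail

print("lag:", lag)
print("missing:", missing, "of which scattered through the hour:", scattered)
```

So Splunk is a little over seven minutes behind on the newest event, and a handful of the missing events are gaps in the middle rather than just lag at the end.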

For example:
Splunk | FMC (Sorry for the crappy output.)

Splunk _time | Splunk signature | FMC time | FMC classification
- | - | 11/21/2017 7:46 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)
- | - | 11/21/2017 7:45 | SERVER-WEBAPP Java XML deserialization remote code execution attempt (1:44315:2)
- | - | 11/21/2017 7:45 | OS-OTHER Bash CGI environment variable injection attempt (1:31978:5)
- | - | 11/21/2017 7:45 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)
- | - | 11/21/2017 7:45 | SERVER-APACHE Apache Struts remote code execution attempt (1:41922:3)
- | - | 11/21/2017 7:45 | SERVER-WEBAPP Java XML deserialization remote code execution attempt (1:44315:2)
- | - | 11/21/2017 7:44 | OS-OTHER Bash CGI environment variable injection attempt (1:31977:5)
- | - | 11/21/2017 7:44 | SERVER-WEBAPP JBoss JMXInvokerServlet access attempt (1:24343:4)
- | - | 11/21/2017 7:44 | FILE-FLASH Adobe Flash Player MSIMG32.dll dll-load exploit attempt (1:38872:1)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts2 blacklisted method redirect (1:29747:6)
- | - | 11/21/2017 7:43 | SERVER-WEBAPP JBoss web console access attempt (1:24342:3)
- | - | 11/21/2017 7:43 | SERVER-WEBAPP JBoss JMX console access attempt (1:21516:9)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts2 blacklisted method redirect (1:29748:6)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts remote code execution attempt (1:41922:3)
- | - | 11/21/2017 7:43 | SERVER-WEBAPP Java XML deserialization remote code execution attempt (1:44315:2)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts2 blacklisted method redirect (1:29747:6)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts2 blacklisted method redirect (1:29748:6)
- | - | 11/21/2017 7:43 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)
- | - | 11/21/2017 7:42 | SERVER-WEBAPP JBoss web console access attempt (1:24342:3)
2017-11-21T07:39:17.000-0600 | OS-OTHER Bash CGI environment variable injection attempt | 11/21/2017 7:39 | OS-OTHER Bash CGI environment variable injection attempt (1:31978:5)
2017-11-21T07:38:42.000-0600 | FILE-FLASH Adobe Flash Player MSIMG32.dll dll-load exploit attempt | 11/21/2017 7:38 | FILE-FLASH Adobe Flash Player MSIMG32.dll dll-load exploit attempt (1:38872:1)
- | - | 11/21/2017 7:34 | OS-OTHER Bash CGI environment variable injection attempt (1:31978:5)
2017-11-21T07:28:13.000-0600 | SERVER-WEBAPP JBoss JMX console access attempt | 11/21/2017 7:28 | SERVER-WEBAPP JBoss JMX console access attempt (1:21516:9)
2017-11-21T07:22:42.000-0600 | SERVER-APACHE Apache Struts remote code execution attempt | 11/21/2017 7:22 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)
- | - | 11/21/2017 7:21 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)
- | - | 11/21/2017 7:21 | SERVER-WEBAPP JBoss JMXInvokerServlet access attempt (1:24343:4)
2017-11-21T07:16:34.000-0600 | SERVER-WEBAPP Java XML deserialization remote code execution attempt | 11/21/2017 7:19 | SERVER-WEBAPP Java XML deserialization remote code execution attempt (1:44315:2)
- | - | 11/21/2017 7:16 | SERVER-WEBAPP Java XML deserialization remote code execution attempt (1:44315:2)
2017-11-21T07:16:02.000-0600 | SERVER-APACHE Apache Struts remote code execution attempt | 11/21/2017 7:16 | SERVER-APACHE Apache Struts remote code execution attempt (1:39191:2)

douglashurd
Builder

A couple of easy things to check. Sorry if these are super obvious.
- check to make sure the Intrusion Events option on the eStreamer config page is toggled on
- make sure you see the events in the FMC UI
- search for rec_type=400 in the Splunk search
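
If it helps to check the raw output outside of Splunk, here is a small sketch that parses key=value event lines like the one posted in this thread and flags rec_type=400. The helper names are made up, and it assumes values containing spaces are double-quoted as in the posted sample.

```python
import shlex

# rec_type to look for, per the checklist above.
INTRUSION_REC_TYPE = "400"


def parse_event(line):
    """Parse a key=value event line; quoted values may contain spaces."""
    event = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            event[key] = value
    return event


def is_intrusion_event(event):
    return event.get("rec_type") == INTRUSION_REC_TYPE


# Sample line copied from the original post in this thread.
sample = ('rec_type=145 name=DMZ rec_type_desc="Access Control Policy Metadata" '
          'rec_type_simple="ACCESS CONTROL POLICY" sensor=zdxidsxxxxx '
          'uuid=00000000-0000-0000-0000-xxxxxxxxxxxx')

event = parse_event(sample)
print(event["rec_type"], event["rec_type_desc"], is_intrusion_event(event))
```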

The old TA might be requesting an older IDS event type. Not sure this matters.

This has to be something super basic.

Doug


Kieffer87
Communicator

Yes we have intrusion events being sent from our production FMC to our production Splunk instance using the old TA and events are showing up in the FMC UI as well.

We do not have the old TA installed on our test Splunk box so that shouldn't be the issue.

I do have 1 rec_type=400 event from two days ago showing in Splunk but should have many more; we generally have 15-20 Intrusion events an hour minimum. One thing I find interesting is that my velocity is generally -.05 even though no events are being sent to Splunk other than status and log events from the python agents. One other thing to note is that there is a single Python process pegged at 100% CPU. The Splunk box is a very beefy box with all flash storage, so I wouldn't expect to have a velocity much lower than 0.

Would the add-on try to pull events from the past from the FMC or will it only pull data going forward?
