Wow, those are really big numbers! I would definitely not recommend setting them that high. It's hard to say exactly what the issue is without digging further into logs or pcaps (best to do that with your SE, if necessary), but here are a few possibilities:
You are running Stream as a modular input via Splunk. There is a fairly low performance ceiling with this architecture, which you would likely hit at 2 Gbps. It creates back-pressure in the event queue, which normally would just cause events to drop with errors. But with these settings you're going to blow out memory, slow everything down, and cause everything to start failing pretty quickly. The only way to make this work is to do more filtering/aggregation at the edge (so far fewer events go to Splunk) OR to use an independent agent configuration. We've tested the latter upwards of 10 Gbps, and it is absolutely a requirement for scaling Stream.
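If you do go the independent agent route, the forwarder is configured through streamfwd.conf instead of the modular input. A rough sketch of the relevant stanza is below; the parameter names are from memory, so verify them against the Stream forwarder docs for your version, and the host/token values are placeholders:

    [streamfwd]
    # send events straight to an indexer's HTTP Event Collector
    # (placeholder URI and token -- substitute your own)
    indexer.0.uri = https://your-indexer:8088
    httpEventCollectorToken = <your-hec-token>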
Your aggregator isn't sending all the packets necessary. For example, it may be dropping FIN packets, causing flows to never close out in reassembly. Lowering tcpConnectionTimeout (and the corresponding UDP timeout values) may help work around this. Unless you really can't tolerate premature termination of those flows, I'd recommend lowering them regardless; values as low as 10 are perfectly reasonable since this is an inactivity timeout. Another thing I've seen a lot is configs that only forward ingress packets, or only forward egress packets, so the flows sit indefinitely waiting for the other side of the "conversation" to arrive. This is easy to check/diagnose with something like tcpdump (see the example below).
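For the tcpdump check, the idea is to confirm you actually see both sides of the conversation, and FIN packets, on the interface the forwarder is capturing from. The interface, host, and port here are just examples:

    # do you see traffic in BOTH directions for a given host/port?
    tcpdump -nn -i eth0 -c 50 host 10.0.0.5 and port 443

    # are FIN packets making it to this box at all?
    tcpdump -nn -i eth0 -c 20 'tcp[tcpflags] & tcp-fin != 0'

If every packet has the same source (or you never see a FIN), the aggregator is only mirroring one direction or stripping teardown packets, and flows will pile up in reassembly exactly as described above.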
You are doing a lot of decryption, or something else that requires a lot of processing early in the pipeline, and need more processorThreads. Since you have 20, you could try increasing that value to see if things improve.
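Putting the timeout and thread suggestions together, the relevant streamfwd.conf settings would look roughly like this. The values are purely illustrative, and you should check the setting names and units against the streamfwd.conf spec for your Stream version:

    [streamfwd]
    # inactivity timeouts -- values as low as 10 are fine, per above
    tcpConnectionTimeout = 10
    udpConnectionTimeout = 10

    # bump this up from 20 if decryption / early-pipeline work is the bottleneck
    processorThreads = 24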