We were running the Checkpoint add-on v2.02 and recently upgraded to 2.1, and we now notice that the CPU load on the heavy forwarder (HF) spikes when we have several connections to CLMs and CMAs. We have 8 connections to CLMs and 8 connections to CMAs on a CentOS system with 6 CPUs, and we average around 88-93% CPU load. This is the case on all 8 of our HFs with this app. If we disable the app, CPU utilization drops to 5-8%. I have tried no_nagle as well as various values for conn_buf_size. On the older version (2.02) we had to set SPLUNK_REST_STATUS_COMMIT=1000000 for it to collect the logs in a timely manner. Right now, that setting has been disabled.
Any suggestions or assistance would be greatly appreciated.
Just posting an answer to get this out of my filter... This is probably because you're asking the software to do a lot of work, the software is doing it, and the hardware isn't robust enough to handle that load.
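To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures from the question (6 CPUs, 16 connections, roughly 88-93% load with the app enabled and 5-8% with it disabled). It assumes the reported percentages are host-wide averages across all 6 CPUs and that the load is spread evenly over the connections; both are assumptions, and the function name `per_connection_cpu` is just made up for illustration.

```python
def per_connection_cpu(cpus, connections, load_enabled_pct, load_disabled_pct):
    """Estimate how much CPU each add-on connection costs.

    Assumes the percentages are host-wide averages over all CPUs and that
    every connection costs about the same (both are assumptions).
    """
    # Share of total host CPU attributable to the app.
    app_pct = load_enabled_pct - load_disabled_pct
    # Convert that share into an equivalent number of fully busy cores.
    cores_used = app_pct / 100 * cpus
    return {
        "app_pct": app_pct,
        "pct_per_connection": app_pct / connections,
        "cores_per_connection": cores_used / connections,
    }


# Midpoints of the ranges quoted in the question: 88-93% and 5-8%.
estimate = per_connection_cpu(
    cpus=6,
    connections=16,  # 8 CLM connections + 8 CMA connections
    load_enabled_pct=90.5,
    load_disabled_pct=6.5,
)
print(estimate)
```

Under those assumptions, each connection works out to about 5% of the whole host, or roughly a third of one core, so 16 connections would be expected to keep about 5 of the 6 cores busy. If the real per-connection cost is anywhere near that, either fewer connections per HF or more cores per HF would be needed to bring the load down.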