I have a custom streaming command that takes a long time to start up (120 seconds) because it loads a cache. I usually stream about 100'000 events to this command in one invocation. This worked fine until we changed the command to use Python 3. Now the same Splunk query, which streams about 100'000 events to the custom command, results in the command being called several times with chunks of 10'000 events, which obviously makes it very inefficient.
It's a distributed environment with a single search head and a 100-node dual-site indexer cluster. The search head normally finds about 2'000'000 events, which are distributed across 20 indexers, resulting in these 100'000 events per call on each indexer.
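To put the overhead in numbers, here is a rough back-of-the-envelope sketch. It assumes that every invocation pays the full 120-second startup again and that the chunks on one indexer are handled one after another; both are my assumptions, and the variable names are only for illustration.

```python
import math

# Figures taken from the description above.
STARTUP_SECONDS = 120          # time to load the cache per invocation
TOTAL_EVENTS = 2_000_000       # events found by the search
INDEXERS_WITH_EVENTS = 20      # indexers that return events
CHUNK_SIZE = 10_000            # events per call since the Python 3 change

# Events each indexer hands to the custom command.
events_per_indexer = TOTAL_EVENTS // INDEXERS_WITH_EVENTS

# Number of separate calls per indexer when events arrive in chunks.
invocations_per_indexer = math.ceil(events_per_indexer / CHUNK_SIZE)

# Assumption: each call reloads the cache, so startup cost is paid every time.
startup_overhead_seconds = invocations_per_indexer * STARTUP_SECONDS

print(events_per_indexer, invocations_per_indexer, startup_overhead_seconds)
```

Under those assumptions the startup cost per indexer grows from a single 120-second load to ten of them.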
Any idea why these chunks are spread over several calls to the custom command?