To scale out throughput when polling messages from queues (not topics), the recommended approach is to scale horizontally: deploy (n) JMS Modular Inputs across (n) Splunk Forwarders (Heavy or Universal) and forward the data into an Indexer cluster.
Check out this preso, starting at slide 20:
http://www.slideshare.net/damiendallimore/splunk-conf-2014-getting-the-message
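As a rough illustration of why this works for queues in particular: with queue semantics, each message is delivered to only one of the competing consumers, so adding consumers splits the load rather than duplicating it. The sketch below is not JMS code. It uses Python's in-process `queue.Queue` as a stand-in for the broker queue and plain threads as stand-ins for the separate forwarders, and the names `drain` and `per_consumer` are made up for the example.

```python
import queue
import threading


def drain(shared_queue, n_consumers):
    """Run n_consumers competing workers; return the messages each one received."""
    received = [[] for _ in range(n_consumers)]

    def worker(idx):
        # Each worker pulls from the same queue until it is empty.
        while True:
            try:
                msg = shared_queue.get_nowait()
            except queue.Empty:
                return
            received[idx].append(msg)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


# Stand-in for a broker queue holding 1000 messages.
q = queue.Queue()
for i in range(1000):
    q.put(i)

# Four competing consumers, standing in for four forwarders.
per_consumer = drain(q, 4)
print([len(msgs) for msgs in per_consumer])
```

Every message ends up with exactly one consumer, which is the property that makes (n) pollers on one queue safe. With a topic, every subscriber would get its own copy, so the same trick would just duplicate data.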
Adding more stanzas within a single JMS Modular Input instance will soon hit limits, because each stanza is just a thread in the same JVM (this addresses your points 5 and 6 above). That is why I recommend multiple JMS Mod Inputs across multiple forwarders.
Furthermore, a single JMS Modular Input instance will likely hit a bottleneck in the STDOUT/STDIN OS buffer between the Modular Input process (writing to STDOUT) and the Splunk Forwarder instance (reading from STDIN), which may lead to blocking in the JMS Mod Input's queue poller logic.
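To make the buffer point a bit more concrete, here is a minimal sketch (POSIX only, and not taken from the JMS Modular Input itself) showing that an OS pipe only accepts a finite number of bytes when nobody is reading the other end. The function name `pipe_capacity` is made up for the example. The write end is set to non-blocking purely so the script can detect the full pipe; a normal blocking writer, which is what I'd assume a modular input's STDOUT behaves like, would simply stall at that point until the reader catches up.

```python
import os


def pipe_capacity():
    """Write to a pipe that nobody reads until the OS refuses more; return bytes accepted."""
    read_fd, write_fd = os.pipe()
    # Non-blocking, so a full pipe raises BlockingIOError instead of hanging the script.
    os.set_blocking(write_fd, False)
    chunk = b"x" * 1024
    written = 0
    try:
        while True:
            written += os.write(write_fd, chunk)
    except BlockingIOError:
        # The pipe buffer is full; a blocking writer would be stuck here.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


capacity = pipe_capacity()
print(f"pipe accepted {capacity} bytes before filling up")
```

The exact number depends on the OS and its settings, but it is always finite. Once the writer gets that far ahead of the reader, the writing side waits, and that wait is what can back up into the queue poller.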
So, scale out horizontally 🙂