Will I be forced to build a heavy forwarder to get logs from various appliances, rather than sending them directly to Splunk Cloud? Since I can't alter some of the appliances to accept the PEM certificate or change their log formatting, I assume so, yet I can't find this explicitly called out in the documentation.
As @sobrien noted, a forwarder on a consolidated syslog server is the way to go. I would also suggest having a heavy forwarder in the mix, either as the syslog server itself or as a 'relay': the syslog server runs a universal forwarder, which forwards to/through the heavy forwarder. A heavy forwarder lets you perform the advanced functionality such as transforms and masking before the data leaves your network. That means you can send your indexers only the events that have value for your enterprise (reducing the amount you index against your license) and mask security-specific fields that need to be obfuscated. It also means you only have to poke a single hole (or at least fewer holes) through your outbound firewall(s).
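As a rough sketch of what the filtering and masking could look like on the heavy forwarder: the stanza name `drop_debug_events`, the `syslog` sourcetype, and both regexes are assumptions made for illustration, so adjust them to whatever your appliances actually emit.

```ini
# props.conf on the heavy forwarder
# (assumes the appliance events arrive with sourcetype "syslog")
[syslog]
# Route events matching the transform below to the null queue,
# so they are discarded and never count against the license
TRANSFORMS-drop_noise = drop_debug_events
# Mask all but the last four digits of any 16-digit number (illustrative pattern)
SEDCMD-mask_card = s/\d{12}(\d{4})/XXXXXXXXXXXX\1/g

# transforms.conf on the heavy forwarder
# ("drop_debug_events" is a hypothetical stanza name)
[drop_debug_events]
REGEX = \sDEBUG\s
DEST_KEY = queue
FORMAT = nullQueue
```

This only works on a heavy forwarder (or on the indexers), because a universal forwarder does not parse events and so will not apply these settings to syslog data.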
The best way to achieve this is to set up a centralised syslog server that aggregates the feeds from your various appliances, and to run a Splunk forwarder on that same server. This lets you send data to Splunk Cloud in an encrypted format and also gives you a greater level of fault tolerance. This is not dissimilar to best practice within Splunk Enterprise, as outlined here: https://wiki.splunk.com/Things_I_wish_I_knew_then
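For illustration only, here is a minimal sketch of the forwarder side. It assumes the syslog daemon writes one directory per sending appliance under `/var/log/remote/` and that an index named `network` already exists in your Splunk Cloud stack; both the path and the index name are hypothetical.

```ini
# inputs.conf on the forwarder running on the syslog server
# (path layout assumed: /var/log/remote/<appliance-hostname>/<file>.log)
[monitor:///var/log/remote/*/*.log]
sourcetype = syslog
# Take the host name from the 4th path segment: var/log/remote/<host>
host_segment = 4
# Hypothetical index name; it must already exist in Splunk Cloud
index = network
disabled = false
```

Because the syslog daemon writes to disk first, the forwarder can catch up from the files after a restart or a network outage, which is where the extra fault tolerance comes from.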
Another great article that outlines how to achieve Syslog success with Splunk:
Hope this helps.