So Message Tracking and IIS logs are the primary log types. There is also a PowerShell script that collects admin logs (if you have those enabled in your environment). Note that other statistics are collected through perfmon and scripted inputs, so your total consumed volume will be higher than the volume of those logs alone.
Compression ratio depends entirely on your environment. Splunk will typically tell you to expect an average compression ratio of 0.5, but this all depends on your data. In our environment, it's typically about twice as good as that; that is, we see compression ratios of around 0.25.
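To make the arithmetic concrete, here is a rough sketch of how the ratio feeds into a disk estimate. It assumes the ratio means stored size divided by raw size (so lower is better), and the function name, the 10 GB/day volume and the 90-day retention are made-up illustration values, not figures from our environment. It also ignores replication and any other overhead.

```python
def estimated_disk_gb(daily_raw_gb: float, retention_days: int,
                      compression_ratio: float) -> float:
    """Rough disk estimate: raw daily volume x retention x compression ratio.

    compression_ratio is assumed to be stored size / raw size,
    so 0.25 uses half the disk that 0.5 does.
    """
    if daily_raw_gb < 0 or retention_days < 0:
        raise ValueError("volume and retention must be non-negative")
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return daily_raw_gb * retention_days * compression_ratio


if __name__ == "__main__":
    # Hypothetical inputs: 10 GB/day of raw logs kept for 90 days.
    for ratio in (0.5, 0.25):
        print(f"ratio {ratio}: {estimated_disk_gb(10, 90, ratio)} GB")
```

With those made-up inputs, the 0.5 rule of thumb gives 450 GB and a ratio of 0.25 gives 225 GB, so it is worth measuring the ratio on your own data before sizing storage.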