We have a few instances hosted in AWS that are extremely underutilized (single-digit average CPU% over the last 3 months).
AWS Compute Optimizer has recommended the following changes to the instances.
Is there a reference document that helps us map the number of CPU cores to the number of concurrent searches that can be run?
We want to take this back to the security folks to see if there is an opportunity to right-size the currently underutilized instances (single-digit CPU%) and thereby reduce costs.
I'm not aware of such a document, but you can see the number of concurrent searches using the Monitoring Console (MC). By default, the maximum number of concurrent searches is 6 + <numCPUs>. That formula can be modified using limits.conf settings, but the default is good for most environments.
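If you do want to inspect or change that limit, the relevant settings live in the [search] stanza of limits.conf. A minimal sketch of the defaults that produce the 6 + CPUs ceiling is below; check the limits.conf spec for your Splunk version before changing anything.

  # limits.conf -- defaults; max concurrent historical searches =
  # max_searches_per_cpu x number_of_cpus + base_max_searches
  [search]
  base_max_searches = 6
  max_searches_per_cpu = 1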
If you see errors in the MC about searches being skipped because the maximum number of concurrent searches has been reached, then you are not under-utilizing your server. Try redistributing your scheduled searches so fewer run at the same time. If you still see the error after that, then you are over-using the server and need more CPUs (or fewer searches).
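A quick way to check for skipped searches outside the MC is to query the scheduler logs in _internal. Something along these lines (field names as used by the standard scheduler sourcetype) shows how often searches were skipped and why:

  index=_internal sourcetype=scheduler status=skipped
  | stats count by reason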
If there are times when the server is not running any searches, then you are under-using it at those times. However, the CPUs still need to be available for the times when searches do run.
Perhaps you need lower-powered CPUs rather than fewer CPUs.
Yes, you can. Whether you should or not is a different (and better) question.
CPU utilization is not a good measure of how well Splunk is using a VM. Recall that the number of concurrent searches Splunk can run is based on how many CPUs are available. Reduce the CPU count and you reduce the number of searches you can run.
If your peak number of concurrent searches is less than the number of CPUs available, then a smaller VM might make sense; otherwise, look at other instance options.
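To make that concrete using the default formula above: a 16-vCPU instance allows 6 + 16 = 22 concurrent searches, while an 8-vCPU instance allows 6 + 8 = 14. If the MC shows your peak concurrency staying comfortably below 14, the smaller instance still has headroom; if it regularly exceeds that, scheduled searches will start to be skipped or deferred.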