In addition to the earlier recommendation to read the Splunk documentation on the subject (Distributed Deployment Manual, Admin Manual, etc.), here are some more notes:
You'll want to ensure that, if you expect the environment to grow, you set it up in a fully distributed manner from the start. That is, you have an indexer cluster, a search head cluster, and potentially heavy forwarders, depending on your specific needs and on whether you have remote sites where you want to aggregate traffic.
The key here is that, as the environment grows, you can continue to scale the infrastructure out "horizontally", adding more systems at each functional level (more indexers to the indexer cluster, more search heads to the search head cluster).
Concerning hardware specs, VMs are usually fine for everything but the indexers. Splunk is supported on virtual machines, but I really recommend you make an exception for the indexers and run them on physical hardware. You'll want to give the indexers as much memory as you can. This will help take load off storage, which is key to indexer performance. That being said, you still need fast storage. CPU needs will largely be based on how much data you are processing and how many searches you tend to run at a time (which depends on how many users you have).
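To make the scaling reasoning concrete, here is a minimal back-of-the-envelope sketch in Python. It is not a Splunk tool or an official sizing formula. The function name and both per-indexer capacity parameters are hypothetical; you would fill them in from your own benchmarks or from the reference hardware guidance in the Splunk docs.

```python
import math


def estimate_indexer_count(daily_ingest_gb, concurrent_searches,
                           gb_per_indexer_per_day, searches_per_indexer):
    """Return a rough indexer count for the given load.

    Both per-indexer capacities are assumptions supplied by the caller,
    not Splunk-published figures.
    """
    if gb_per_indexer_per_day <= 0 or searches_per_indexer <= 0:
        raise ValueError("per-indexer capacities must be positive")
    # Indexers needed to absorb the daily ingest volume.
    for_ingest = math.ceil(daily_ingest_gb / gb_per_indexer_per_day)
    # Indexers needed to serve the expected concurrent searches.
    for_search = math.ceil(concurrent_searches / searches_per_indexer)
    # The tighter constraint wins; always keep at least one indexer.
    return max(for_ingest, for_search, 1)


if __name__ == "__main__":
    # Placeholder numbers purely for illustration.
    print(estimate_indexer_count(500, 12, 200, 8))
    print(estimate_indexer_count(1000, 12, 200, 8))
```

The point is only that either data volume or search concurrency can be the limiting factor, and that growth in either one is handled by adding indexers to the cluster rather than by resizing the existing ones.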
Please let me know if this helps!
I would have two suggestions: