We are starting a new Splunk implementation in our company. I would like to understand the advantages, disadvantages, and cost comparison of running Splunk on virtual machines versus AWS instances for the requirement below.
If you could share details, that would be great. Since this is a new setup for me, please help me compare the two environments.
We have a few application logs, Windows/Linux logs, etc., indexing approximately 100 GB/day.
The raw performance difference between similarly spec'd AWS instances and VMs would be negligible. For 100 GB/day, however, you could take a look at the reference architecture spec and run it all on a single host if you wanted (not saying you should, btw).
From memory, you'd be looking at 12 cores and 12 GB RAM, plus disk big enough for your retention needs.
Splunk is efficient with compression, but if your 100 GB/day estimate is right and you want 30 days of retention, you can see how storage with sufficient IOPS is going to be the biggest cost to rack up yourself (unless you already have a well-provisioned, fast storage platform).
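To put a rough number on that, here is a back-of-the-envelope sketch in Python. The function name and the `disk_ratio` parameter are mine, not anything from Splunk, and the 0.5 default (compressed raw data plus index files taking about half the raw volume) is only a commonly quoted rule of thumb. Treat it as an assumption and measure your own data.

```python
def estimate_index_storage_gb(daily_ingest_gb, retention_days, disk_ratio=0.5):
    """Rough on-disk size for one copy of the indexed data, in GB.

    disk_ratio is an ASSUMPTION: the fraction of raw ingest that ends up on
    disk (compressed raw data plus index files). Real ratios vary by data type.
    """
    if daily_ingest_gb < 0 or retention_days < 0 or disk_ratio <= 0:
        raise ValueError("inputs must be non-negative and disk_ratio positive")
    return daily_ingest_gb * retention_days * disk_ratio


# 100 GB/day kept for 30 days, as in the question.
uncompressed_gb = estimate_index_storage_gb(100, 30, disk_ratio=1.0)
estimated_gb = estimate_index_storage_gb(100, 30)

print(f"Raw volume over retention: {uncompressed_gb:.0f} GB")
print(f"Estimated on disk (ratio 0.5): {estimated_gb:.0f} GB")
```

Under that assumed ratio, 30 days of 100 GB/day comes to roughly 1.5 TB for a single copy, before any headroom or replication.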
Conversely, AWS storage works out pretty reasonable, even for large multi-TB SSD volumes, but on-demand compute is probably going to cost you more over 5 years than the same resources in your hypervisor.
However, if you reserve your AWS instances (and without doing any hard calcs to back this next sentence up), I am sure EC2 will work out significantly cheaper than a service you are hosting yourself.
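If you do want to run the hard calcs, a minimal sketch of the comparison might look like this. Every rate and upfront figure below is a made-up placeholder, not a real AWS or hardware price, and the option names are hypothetical. Plug in your own quotes before drawing any conclusion.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap days


def total_cost(hourly_rate, years, upfront=0.0):
    """Total cost of one always-on host: upfront spend plus hourly running cost."""
    return upfront + hourly_rate * HOURS_PER_YEAR * years


# PLACEHOLDER figures for illustration only, not real prices.
options = {
    "ec2_on_demand": {"hourly_rate": 1.00, "upfront": 0.0},
    "ec2_reserved": {"hourly_rate": 0.50, "upfront": 0.0},
    "self_hosted_vm": {"hourly_rate": 0.25, "upfront": 20000.0},
}

five_year_costs = {
    name: total_cost(years=5, **params) for name, params in options.items()
}

# Print cheapest first.
for name, cost in sorted(five_year_costs.items(), key=lambda kv: kv[1]):
    print(f"{name}: {cost:,.0f}")
```

The ranking that comes out depends entirely on the numbers you feed in, so the output of the placeholder run says nothing about real pricing. Storage and the other AWS charges would need to be added on top for a full picture.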
With all of that said, it really comes down to your infrastructure: where are your target machines, on-prem or in AWS?
There are costs over and above just EC2 when using AWS (data transfer, Direct Connect (Dx), VPN, support, etc.), but if you have a good handle on your use of AWS, I can highly recommend it. I have run multi-site clusters, search heads, and thousands of forwarders in AWS and never once found it lacking compared with on-prem VMs or bare metal.