Hello,
I am a Splunk Enterprise Certified Admin who has an opportunity to advance to Splunk Architect with someone retiring. I am planning on taking the Splunk Architect courses but would like to set up a homelab to give myself practice and experience as well.
In order to best prepare myself, I’d like to set up a virtual home lab with a Splunk distributed search environment, an indexer cluster, and a deployment server to deploy apps to the forwarders. How many Ubuntu Server VMs in Hyper-V should I spin up in total? I’m thinking 1 search head, at least 2 indexers (right?), the deployment server, a management node, and possibly an HF for practice. So possibly a total of six VMs? Or is that too few, or too many? It depends on how many Splunk roles each VM can play, which I’m not entirely certain about. It’s difficult to find this information online.
I’m not planning on ingesting much data, just a few data sources for practice. This is really more of a Proof of Concept and learning opportunity for me from an architecture perspective.
Thanks in advance and I look forward to hearing back!
Hi @cjkoenig ,
I'm a Splunk Architect and I created a Splunk HomeLab for myself.
I used six virtual machines:
The Management System contains three Splunk instances working on different ports:
They are all CentOS systems.
To generate logs I can use my client, and all systems send their own logs to the indexers.
Ciao.
Giuseppe
Hi @gcusello,
This was tremendously helpful, thank you for the precise answer. Can I lastly ask you how much RAM, CPUs, and storage you provisioned for each VM? I’m a bit limited on resources. Were all six the same amount?
Hi @cjkoenig,
I created the lab only to test connections and configurations, with little data to ingest and only one user, so I gave each VM 1 CPU and 2 GB of RAM. In other words, I built this lab on my personal computer using VM-Player.
Obviously I can't use it for real work, only for testing configurations and installation steps.
Likewise, when you check the health of your infrastructure at the end, you should ignore the resource warnings you'll get.
Ciao.
Giuseppe
Thanks very much for your help, this will make setting up my lab environment much easier.
In general I'd advise you to look at a Linux-based KVM hypervisor. This way you'll be able to use a very neat Linux feature called Kernel Samepage Merging (KSM). Since you'd most probably be spinning up VMs from the same OS and version, they'd contain exactly the same code, so the hypervisor can merge identical memory pages from several VMs into single copies, which can reduce your memory footprint considerably. If you're not planning on ingesting much data, that duplicated OS and application code is what would otherwise use up a significant part of your resources.
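If you want to see whether KSM is actually merging anything on the KVM host, a minimal sketch like the one below can help. It assumes a Linux host that exposes the standard KSM counters under `/sys/kernel/mm/ksm`; the function names are made up for illustration, and `pages_sharing` multiplied by the page size is only a rough estimate of the memory saved.

```python
import os
from pathlib import Path

# Standard sysfs location of the KSM counters on a Linux host.
KSM_DIR = Path("/sys/kernel/mm/ksm")


def read_ksm_counter(name, ksm_dir=KSM_DIR):
    """Return a KSM sysfs counter as an int, or None if it cannot be read."""
    try:
        return int((Path(ksm_dir) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def ksm_saved_bytes(ksm_dir=KSM_DIR, page_size=None):
    """Rough estimate of memory saved by KSM: pages_sharing * page size."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    sharing = read_ksm_counter("pages_sharing", ksm_dir)
    if sharing is None:
        return None
    return sharing * page_size


if __name__ == "__main__":
    run = read_ksm_counter("run")
    saved = ksm_saved_bytes()
    if run is None or saved is None:
        print("KSM counters are not available on this host")
    else:
        print(f"KSM running: {run == 1}, approx. saved: {saved / 2**20:.1f} MiB")
```

Run it on the hypervisor host, not inside a guest. If `run` is 0, KSM is switched off and nothing is being merged.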
I agree with @skramp. Also, the VMs can be much smaller than in Production. Splunk's hardware recommendations don't apply to labs. 🙂
Hi Rich,
Thank you and @skramp for helping out. I was a little shocked when I saw the hardware recommendations Splunk has defined, so that’s a relief.
Would you be able to specify roughly how much RAM, how many CPUs, and how much storage I should give each VM? This is for a small amount of ingestion as a POC, just enough for the Splunk instances to work decently together. Is, for example, 4 GB of RAM, 2 CPUs, and 16 GB of storage for each VM enough? Or would you recommend different numbers? Does it vary depending on whether I am spinning up the DS/CM combo or a SH?
Hi
You are on the right path with those nodes. I would add a few more, based on the Splunk Deployment Practical Lab content and other trainings: an LM for licensing, an MC for monitoring, and a UF for testing separate event routing. Of course you could combine some roles, but it's probably easier to keep them separate if you have enough resources in your virtualisation environment. As others have already mentioned, the nodes don't need to be sized according to the Splunk reference architecture. I suppose your example figures are enough for most of those nodes. The indexers may need some more disk space so that they aren't full all the time. In real use you shouldn't overbook CPU or memory for Splunk nodes, but in a lab I suppose that's not an issue. I'm not sure whether the Architect lab/exam needs an SHC in the deployment or whether an individual SH is enough.
Update: I just checked what's on the exam, and SHC is also mentioned. So you will need 3 SHC member nodes and a Deployer for them. See: https://www.splunk.com/en_us/resources/splunk-certification-exam-study-guide.html
To set up an LM in your lab you need a Developer (not Dev/Test) license, which contains all features and also supports distributed environments. Actually the Developer license is the better choice, as you don't need to be a Splunk customer to get it, which is a requirement for the Dev/Test license!
r. Ismo
In general you are right: each role gets a separate VM. But for a lab and non-production environment of this size, you can consolidate the deployment server and cluster master.
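To get a feel for what the host needs, here is a rough sketch that tallies one possible layout of the roles mentioned in this thread. The VM names and the grouping on the `mgmt` VM are only an assumption for illustration (the thread suggests combining the deployment server and cluster master; putting the LM and MC there too is my own guess), and the per-VM size is the 2 CPU / 4 GB / 16 GB figure proposed in the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    cpus: int
    ram_gb: int
    disk_gb: int


# Per-VM size proposed in the question (2 CPUs, 4 GB RAM, 16 GB storage).
PROPOSED = VmSpec(cpus=2, ram_gb=4, disk_gb=16)

# Hypothetical layout: VM name -> Splunk roles hosted on it.
# The grouping on "mgmt" is an assumption, not an official recommendation.
LAYOUT = {
    "sh1": ["SHC member"],
    "sh2": ["SHC member"],
    "sh3": ["SHC member"],
    "deployer": ["SHC deployer"],
    "idx1": ["indexer"],
    "idx2": ["indexer"],
    "mgmt": ["cluster master", "deployment server", "LM", "MC"],
    "hf": ["heavy forwarder"],
    "uf": ["universal forwarder"],
}


def total_resources(layout, spec):
    """Total host resources if every VM in the layout gets the same spec."""
    n = len(layout)
    return VmSpec(spec.cpus * n, spec.ram_gb * n, spec.disk_gb * n)


if __name__ == "__main__":
    total = total_resources(LAYOUT, PROPOSED)
    print(f"{len(LAYOUT)} VMs: {total.cpus} vCPUs, "
          f"{total.ram_gb} GB RAM, {total.disk_gb} GB disk")
```

With this layout that comes to 9 VMs, 18 vCPUs, 36 GB of RAM, and 144 GB of disk. At 1 CPU and 2 GB per VM, as in the lab described earlier in the thread, the same nine VMs would need 9 vCPUs and 18 GB of RAM, and since vCPUs and memory can be overcommitted in a lab, the physical host can be smaller than these totals.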