Maybe a bit of a challenging question, but how "intelligent" are Splunk clusters really?
Say you have an Index Cluster with 10* servers already running, each with a 12-core CPU, and we need more cores in the cluster to deal with the rising demand for ingesting even more incoming events.
All hosts (Linux) are virtual machines on VMware.
What will happen to the Index Cluster if we add another 5 Index Servers to the existing cluster, each with fewer cores (6 each)?
In other words, even though it might not be the optimal solution, will the Index Cluster still benefit from adding more servers with fewer cores each (compared to the existing ones)?
If it will benefit, are there any tuning steps and/or configs that will help the Cluster perform optimally with divergent servers in it?
PS: The reason for asking is that, right now, it’s much faster to get new servers with 6 cores.
I'd be most happy to get some input on this subject, and, in general, hear a bit more about how "intelligent" and "flexible" the different Splunk instances are in dealing with divergence in capacity within clusters (indexer and Search Heads).
Now, that's really difficult to answer without knowing anything about your cluster configuration and your backend. It depends entirely on several factors. Your cluster will be as intelligent as you configure it to be.
Edit: fixed the misleading bucket explanation
Thanks for your response.
Let me be more precise here: What will happen to our Index Cluster if we add 12 more servers, each with 6 cores (72 more cores in total), to our existing 16 indexers with 12 cores each (192 cores in total)?
Will we gain 50% more indexing and/or search capacity?
How will the CM handle the divergence in cores/host?
Do we HAVE TO have completely identical hosts in the Index Cluster (I know it's recommended, but what's recommended is not always possible to get)?
This IS a Splunk infrastructure challenge, and I'd like/need to know how flexible/smart the system is.
So you already have a pretty big setup. This is not easy to scale anymore, and I would suggest contacting Splunk Professional Services for further assistance. As far as I am aware, it's not a good idea to mix several high-end indexers (btw, 12-core indexers are not high-end) with much smaller ones.
Your question was how you can get more events in per second, which does not necessarily mean deploying more indexers. That's why I suggested taking a look at tuning and the Monitoring Console. Why? Because an incoming event goes through several steps, any of which can slow your event processing down. Your events pass through multiple queues: parsing, aggregation, typing, indexing. The Monitoring Console might actually show where you can further debug your cluster. More cores for the existing indexers might even help, as they increase the number of threads an indexer can run. Just an example. You get what I mean.
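To make the queue check a bit more concrete, here is a minimal sketch that counts how often each queue reports itself as blocked. It assumes lines in the `group=queue, name=..., blocked=true` style that splunkd writes to metrics.log; the sample lines and their values are made up for illustration, so check the format against your own logs first. The Monitoring Console shows the same information without any scripting.

```python
import re
from collections import Counter

# Illustrative lines in the style of splunkd's metrics.log (values are made up).
SAMPLE_LINES = [
    "01-15-2024 10:00:01.123 +0000 INFO  Metrics - group=queue, name=parsingqueue, max_size_kb=512, current_size_kb=0, current_size=0, largest_size=3, smallest_size=0",
    "01-15-2024 10:00:01.123 +0000 INFO  Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1210, largest_size=1210, smallest_size=0",
    "01-15-2024 10:00:32.125 +0000 INFO  Metrics - group=queue, name=typingqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=980, largest_size=980, smallest_size=0",
    "01-15-2024 10:00:32.125 +0000 INFO  Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1305, largest_size=1305, smallest_size=2",
]

# Captures the queue name from a "name=<queue>" field.
QUEUE_NAME = re.compile(r"\bname=(\w+)")


def count_blocked_queues(lines):
    """Return a Counter mapping queue name -> number of blocked=true samples."""
    blocked = Counter()
    for line in lines:
        # Only queue metrics that explicitly report a blocked state are counted.
        if "group=queue" not in line or "blocked=true" not in line:
            continue
        match = QUEUE_NAME.search(line)
        if match:
            blocked[match.group(1)] += 1
    return blocked


if __name__ == "__main__":
    for name, hits in count_blocked_queues(SAMPLE_LINES).most_common():
        print(f"{name}: blocked in {hits} sample(s)")
```

A queue that is blocked often usually points at the stage after it being the slow one, so the queue furthest down the pipeline that keeps blocking is the first place I would look.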
So, no, I would not expect a 50% increase in indexing rate, especially if the same SAN with the same disk controllers sits behind the new indexers. Before bringing new indexers into the cluster, I would first try to increase the CPU cores and RAM of your existing ones, and look for any blocked queues that take too long to process events.
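As a quick sanity check on the numbers from the question, here is the raw core arithmetic. This is only a sketch that assumes capacity scales linearly with core count, which is a best case and not something I would expect in practice.

```python
def core_increase(existing_hosts, existing_cores, new_hosts, new_cores):
    """Return (cores before, cores added, relative increase) for a cluster expansion."""
    before = existing_hosts * existing_cores
    added = new_hosts * new_cores
    return before, added, added / before


# Figures from the question: 16 indexers x 12 cores, plus 12 indexers x 6 cores.
before, added, ratio = core_increase(16, 12, 12, 6)
print(f"{before} cores today, {added} added: +{ratio:.1%}")
# prints "192 cores today, 72 added: +37.5%"
```

So even under the most optimistic assumption, the extra cores come to 37.5%, not 50%.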
Having 12 cores only meets the requirements of a reference host, so giving all of your indexers more cores and more RAM (depending on your overall load), together with the IOPS increase mentioned, would be my first step, along with more parallel ingestion pipelines if needed. Around 1500 IOPS per indexer shouldn't be a problem with an SSD-supported SAN. That's my personal opinion though.
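On the shared SAN point, a rough estimate of the aggregate IOPS the storage would have to deliver looks like this. It is a sketch that assumes every indexer, old or new, needs the same roughly 1500 IOPS; the real figure depends on your load and what the SAN can actually sustain.

```python
IOPS_PER_INDEXER = 1500  # rough per-indexer figure from the reply above


def total_iops(indexer_count, iops_per_indexer=IOPS_PER_INDEXER):
    """Aggregate IOPS the shared storage must sustain for the whole cluster."""
    return indexer_count * iops_per_indexer


current = total_iops(16)        # the existing 16 indexers
expanded = total_iops(16 + 12)  # after adding 12 more on the same SAN
print(f"current: {current} IOPS, after expansion: {expanded} IOPS")
# prints "current: 24000 IOPS, after expansion: 42000 IOPS"
```

If the SAN is already close to its limit at the current figure, adding indexers on the same storage will not buy much, regardless of how many cores they have.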
Your points are valid, but unfortunately there are more constraints here.
We're over capacity on our Index Cluster, and we can't get more cores for the existing servers; that's just the way it is.
So frankly, I'm looking for alternative ways to add more capacity while keeping the current servers.