Hi all,
We're starting to ramp up our usage of Splunk with a lot of extra data, eventually adding Enterprise Security, and people on other teams are starting to get into Splunk: requesting forwarder installations and configurations to pull in data relevant to their work, creating dashboards, etc. We have a ton of real-time searches that are used for alerting on a few different applications, and I can only see more in the future.
We have 1 SH (4x CPU, 8 gigs RAM), 2 Indexers, 1 heavy forwarder, and 1 cluster manager, all VMs. IO and search times are all in good ranges and nothing is slow; I am just preparing for the future.
We can pump up the specs on the Search Head, or create a new Search Head. How do I decide which will work best? Another SH seems like it might add complexity, but is there a threshold where boosting SH stats will not really help performance?
You can go far, FAR higher before boosting SH specs will stop helping. There's a lot of room to increase before you need to think about adding more machines for load. And, if they're sharing resources, adding more machines may not help as much as it could anyway.
As it is, Splunk's "recommended" specs call for, even as virtual machines, 2x 6 core processors and 12 GB of RAM. I think those are fine minimum specs. You can often get by on less in a very small environment, but I would say that until you hit at least that level - probably double or triple that much RAM - I wouldn't even think of adding additional SHs for load. For isolation? Maybe. For redundancy? Maybe. For load? No.
If/when you do ES, you will add a separate SH dedicated solely for ES. Splunk Professional Services will highly, HIGHLY recommend that and may even require it. ES is very snobby and likes to be isolated and put on its own little island where it won't have conflicts with other things. And it doesn't play well with clustering/pooling on the ES SH side of things. (Indexers - sure, SHs, no).
As an aside: having gone through an ES install/configuration recently myself, I can't say "use PS to do your ES" strongly enough - it would be insanity for any normal admin to try to stand up ES on their own.
Some additional considerations and minor expansion on points made above:
Disk IO on your indexers. What sort of IOPS can the hardware under your indexers sustain? Is the storage shared, so that it may perform worse at certain times than at others? More IOPS makes for a happier indexer. RAM on indexers is also more useful than is generally given credit for - doubling RAM (at least in many cases I've seen documented) cuts the IOPS needed by half or more because of the additional caching, and RAM is usually cheaper than IOPS. (SSDs are changing this, but RAM is still faster than an SSD.)
Given how resource intensive Splunk can be, always keep in mind that virtualization isn't always the best solution. It can work, and can even work fairly well at moderate loads, but it's hard for most virtualization stacks to deliver the performance Splunk can really use when you start working it hard. It is a lot like SQL in that regard: it takes work to make it work WELL at higher loads/volumes in a virtualized environment. But then again, virtualization has its own benefits: you can increase hardware (within limits) as necessary, can shuffle "disks" around to increase speeds without downtime, have good resource monitoring built in so you can tell when you need to bump CPU or RAM, and so on.
Someone running a larger environment can confirm/deny this, but I think a single SH can be expanded QUITE a ways before you really start showing any drop off in gains. Hundreds of GBs of RAM, multiple dozens of fast cores, etc... I know of people running 96 core, 512 GB RAM SHs.
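To put rough numbers on why extra cores help a search head, here is a small Python sketch of how Splunk derives its concurrent search ceilings from the core count. It assumes the limits.conf defaults as I understand them (base_max_searches = 6, max_searches_per_cpu = 1, max_rt_search_multiplier = 1, and max_searches_perc = 50 for the scheduler); those defaults can differ by version and may have been changed locally, so check your own limits.conf before relying on the output. The function names are made up for illustration and are not part of any Splunk API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchLimits:
    """Concurrency settings from limits.conf.

    The values are ASSUMED defaults; verify them against your own
    limits.conf, since they vary by version and local overrides.
    """
    base_max_searches: int = 6         # [search] stanza
    max_searches_per_cpu: int = 1      # [search] stanza
    max_rt_search_multiplier: int = 1  # [search] stanza
    max_searches_perc: int = 50        # [scheduler] stanza


def historical_limit(cpus: int, limits: SearchLimits = SearchLimits()) -> int:
    """Max concurrent historical searches for a given core count."""
    return limits.max_searches_per_cpu * cpus + limits.base_max_searches


def realtime_limit(cpus: int, limits: SearchLimits = SearchLimits()) -> int:
    """Max concurrent real-time searches (a multiple of the historical limit)."""
    return limits.max_rt_search_multiplier * historical_limit(cpus, limits)


def scheduled_limit(cpus: int, limits: SearchLimits = SearchLimits()) -> int:
    """Share of the historical limit that the scheduler is allowed to use."""
    return historical_limit(cpus, limits) * limits.max_searches_perc // 100


if __name__ == "__main__":
    # Compare the original 4-core SH with a few larger sizes.
    print(f"{'cores':>5} {'historical':>10} {'real-time':>9} {'scheduled':>9}")
    for cores in (4, 8, 12, 24):
        print(
            f"{cores:>5} {historical_limit(cores):>10} "
            f"{realtime_limit(cores):>9} {scheduled_limit(cores):>9}"
        )
```

Under those assumed defaults, each added core raises the ceiling by one concurrent search, so a search head running lots of real-time alerts hits the limit sooner on a small VM. The ceiling is only a cap on concurrency, though; whether the searches themselves run fast still depends mostly on the indexers.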
Indeed, adding resources to the SH would be the first step. When searches do slow down you'll likely benefit more from adding indexers, or at least beefing them up - most searches cause more load on the indexers than on the search head.
Thanks for the advice, everyone. I've bumped it up to 8 cores and 16 gigs RAM for now. As our needs grow I'll keep expanding the VM when possible.
We will also set up a similar spec VM for ES once we start going down that path; we are enlisting professional services to set it up so I assume they'll come to us ahead of time with that necessity.