In a Kubernetes environment, you can scale your application up or down with a simple command, a UI, or automatically with autoscalers. However, to scale successfully, you need to know when you’re hitting scaling limits and whether your scaling efforts are effective. Otherwise, you might continue to use resources inefficiently or hit avoidable application performance issues. In this post, we’ll check out Kubernetes Horizontal Pod Autoscaling (HPA), when you might use it, caveats you might hit when scaling pods, and how you can use Splunk Observability Cloud to gain insight into your Kubernetes environment and ensure you’re scaling efficiently and effectively.
Autoscaling is an awesome way to increase the capacity of your Kubernetes environment to match application resource demands with minimal manual intervention. With autoscaling, scalable resources automatically increase or decrease with variable demand. This creates a more elastic, more performant, and more efficient (both in terms of application resource consumption and infrastructure costs) Kubernetes environment.
Kubernetes supports both vertical and horizontal scaling. With vertical scaling (up/down), resources like memory and CPU are adjusted in place (think increasing or decreasing memory for an existing workload), whereas with horizontal scaling (out/in), the number of replicas increases or decreases (think adding or removing copies of a workload). Vertical scaling is great for right-sizing your Kubernetes workloads to ensure they have the resources they need. Horizontal scaling is great for dynamically distributing load to meet unexpected bursts or lulls in traffic.
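The two scaling dimensions correspond to different fields in a workload spec. As a minimal sketch (the names, image, and values here are illustrative, not from our actual deployment):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache            # illustrative name
spec:
  replicas: 1                 # horizontal knob: number of pod copies
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example   # illustrative image
          resources:          # vertical knob: per-pod CPU/memory
            requests:
              cpu: 200m
              memory: 64Mi
            limits:
              cpu: 500m
              memory: 128Mi
```

Vertical scaling adjusts the values under resources; horizontal scaling adjusts replicas.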
Horizontal and vertical autoscaling can be configured at the cluster and/or pod level using Cluster Autoscaling, Vertical Pod Autoscaling, and/or Horizontal Pod Autoscaling. The Horizontal Pod Autoscaler (HPA) is the only autoscaler included by default with Kubernetes, so we’ll keep our focus on HPA for now.
To scale a Kubernetes workload resource like a Deployment or StatefulSet based on current resource demand, you can scale workloads manually, or you can scale them automatically through autoscaling. Scaling up or down automatically to match demand reduces the need for manual intervention and ensures efficient resource use within your Kubernetes infrastructure. If load increases, horizontal scaling responds by deploying more pods. Conversely, if load decreases, the HorizontalPodAutoscaler instructs the workload resource to scale down, as long as the number of pods is above the configured minimum.
Automatically scaling pods is a hugely beneficial feature of Kubernetes, but there are some caveats when implementing Horizontal Pod Autoscaling. Here are some things to be aware of:
- HPA needs a source of metrics, such as the Kubernetes Metrics Server, installed in the cluster; without one, the autoscaler has nothing to act on.
- Utilization-based targets require pods to define resource requests; without them, the HPA reports the target as unknown and won’t scale.
- HPA only applies to scalable workload resources; it can’t be used with workloads like DaemonSets.
- Avoid pairing HPA with the Vertical Pod Autoscaler on the same CPU or memory metrics, since the two will work against each other.
- Scaling reacts to observed metrics with some delay, so sudden traffic spikes can still cause brief resource pressure before new pods come up.
Now that we know what Horizontal Pod Autoscaling is and some things to be aware of when working with HPA, let’s see it in action.
We have a PHP/Apache Kubernetes deployment in the apache namespace that is exporting OpenTelemetry data to Splunk Observability Cloud. Our manifest creates the workload with a single replica. Let’s jump into the Splunk Observability Cloud Kubernetes Navigator, which we explored in a previous post.
In the Navigator, if we filter down to our cluster and the apache namespace, we can see that we currently have only one pod on our node:
The pod is receiving some significant load, and for HPA example purposes, we have deliberately limited the resources for each Apache pod. We can see spikes in CPU and memory usage that are leading to insufficient resources:
The lack of required resources is throwing containers into a CrashLoopBackOff. For a minute we’ll have 1 active container:
Then suddenly, that container will crash and we’ll have 0 active containers before it attempts to restart again:
Not only can we see these containers starting and stopping in real-time, but the restarts triggered an AutoDetect detector that would have notified our team of an issue:
The Kubernetes Navigator helped us identify our resource issues and the impact they’re having on our containers, but now we need to resolve them. Let’s set up Horizontal Pod Autoscaling so our workload automatically responds to this increased load and scales out by deploying more pods.
First, we’ll create our HPA configuration file at ~/workshop/k3s/hpa.yaml:
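Based on the thresholds we’re targeting (scaling the php-apache Deployment between 1 and 4 replicas at 50% average CPU or 75% average memory utilization), hpa.yaml would look roughly like this sketch using the autoscaling/v2 API (the names and namespace are assumptions matching our setup):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: apache
spec:
  scaleTargetRef:            # the workload this autoscaler manages
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # scale out above 50% average CPU
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75   # scale out above 75% average memory
```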
The HorizontalPodAutoscaler object specifies the behavior of the autoscaler. You can control resource utilization, set the min/max number of replicas, specify the direction of scaling (up/down), set target resources to scale, etc. We’ll apply the configuration by running kubectl apply -f ~/workshop/k3s/hpa.yaml.
We can see that the autoscaler was created and we can validate Horizontal Pod Autoscaling with the kubectl get hpa -n apache command. Here’s what the response looks like:
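For reference, the output of kubectl get hpa -n apache looks something like the following; the values shown here are illustrative, and the exact column formatting varies by kubectl version:

```
NAME         REFERENCE               TARGETS                         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   cpu: 46%/50%, memory: 61%/75%   1         4         4          3m
```

The TARGETS column shows current versus target utilization for each configured metric, which is a quick way to confirm the autoscaler is reading metrics at all (an unknown value here usually means missing resource requests or no metrics source).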
Now that HPA is deployed, our php-apache service will autoscale when either the average CPU usage goes above 50% or the average memory usage for the deployment goes above 75% with a minimum of 1 pod and max of 4 pods. In the Kubernetes Navigator nodes view, we can validate that we now have 4 pods to handle the increased load. We’ve added a filter to highlight the 4 pods in the Apache namespace:
Looking at our K8s pods tab, we can see additional pod-level metrics and again verify the number of active pods is now 4.
If we wanted to allow the autoscaler to scale up to 8 pods, we could simply update our hpa.yaml to set maxReplicas to 8. Once deployed, we can see we now have 8 active Apache pods:
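The change is a one-line edit to the spec (a sketch, assuming an autoscaling/v2 manifest), followed by re-running kubectl apply -f ~/workshop/k3s/hpa.yaml:

```yaml
spec:
  minReplicas: 1
  maxReplicas: 8   # raised from 4 so the autoscaler can run up to 8 replicas
```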
After configuring our HorizontalPodAutoscaler, we can sit back and watch our containers stay up and running as our pods autoscale to handle the increased traffic.
If you’re interested in automatically scaling Kubernetes workloads to match increased load with minimal manual intervention, Horizontal Pod Autoscaling might be for you. Before you get started, watch out for some of those common gotchas we mentioned. To identify pods running heavy on resource utilization and where you might benefit from setting up HPA, check out the Splunk Observability Cloud Kubernetes Navigator. Don’t have Splunk Observability Cloud? We got you. Start a Splunk Observability Cloud free trial! Ready to jump into the Kubernetes Navigator? Get started integrating Kubernetes and Splunk Observability Cloud!