Software Scalability in Kubernetes: 3 Autoscaling Methods

➡ Horizontal Pod Autoscaling (HPA)

Horizontal scaling, or scaling out, means that the number of running pods increases or
decreases dynamically as your application's usage changes.
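
As a rough illustration of how the replica count follows usage, the sketch below applies the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name, the replica bounds, and the sample numbers are made up for this example; the 0.1 tolerance mirrors the documented default, but treat the whole thing as a simplified model rather than the controller's actual code.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified model of the HPA scaling rule (illustrative only)."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go outside the configured bounds (hypothetical defaults here).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods at 90% average CPU utilization against a 60% target: scale out.
    print(desired_replicas(4, 90, 60))
    # Same 4 pods at 30% utilization: scale in.
    print(desired_replicas(4, 30, 60))
```

With a 60% target, 4 pods running at 90% utilization lead to 6 replicas, and the same 4 pods at 30% lead to 2.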

When configuring HPA:

- Set resource requests and limits on all pods; HPA computes resource utilization as a percentage of each pod's request
- Use custom metrics or observed metrics
- Use HPA together with CA whenever possible

➡ Vertical Pod Autoscaling (VPA)

Use the VPA when your application serves heavyweight requests that need more resources
per pod.

VPA recommends optimized CPU and memory request/limit values and can update them
automatically, so that cluster resources are used efficiently. In other words, VPA doesn't add
more replicas of a Pod; it adjusts the memory or CPU requests and limits of the existing ones.
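
To make the idea of a "recommended request" concrete, here is a minimal sketch that derives a CPU request from observed usage samples by taking a high percentile and adding a safety margin. This is only an illustration of the concept, not the VPA recommender's real algorithm; the function name, the 90th percentile, the 15% margin, and the sample data are all assumptions chosen for the example.

```python
import math
from typing import Sequence


def recommend_request(samples_millicores: Sequence[int],
                      percentile: int = 90,
                      margin_percent: int = 15) -> int:
    """Illustrative recommendation: a usage percentile plus a safety margin.

    Hypothetical helper; the real VPA recommender is more involved.
    """
    if not samples_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_millicores)
    # Nearest-rank percentile: smallest sample covering `percentile` % of data.
    rank = math.ceil(percentile * len(ordered) / 100)
    value = ordered[max(0, rank - 1)]
    # Integer arithmetic keeps the result in whole millicores.
    return value * (100 + margin_percent) // 100


if __name__ == "__main__":
    usage = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
    print(recommend_request(usage))
```

For the ten samples above, the 90th percentile is 900m, so the sketch recommends a request of 1035m.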

HPA and VPA conflict when both act on CPU or memory, so please don't use both together for the
same set of pods. HPA uses the resource requests to calculate the utilization that triggers
scaling, and in the meantime VPA modifies those same values, so the two will work against each
other unless you configure the HPA to use either custom or external metrics.
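
A small worked example shows why the two interfere. Utilization is usage divided by the request, so when the request changes, the utilization HPA sees changes too, even though the load is the same. The numbers and function names below are invented for illustration, and the replica formula is the simplified ceil(replicas × current / target) rule without tolerance or bounds.

```python
import math


def cpu_utilization_percent(usage_millicores: float,
                            request_millicores: float) -> float:
    """Utilization as HPA sees it: usage relative to the pod's request."""
    return 100.0 * usage_millicores / request_millicores


def hpa_desired_replicas(current_replicas: int,
                         utilization: float,
                         target_utilization: float) -> int:
    """Simplified HPA rule, ignoring tolerance and min/max bounds."""
    return math.ceil(current_replicas * utilization / target_utilization)


if __name__ == "__main__":
    replicas, usage, target = 4, 450, 60  # hypothetical workload

    # Before VPA acts: request is 500m, so utilization is 90%.
    before = cpu_utilization_percent(usage, 500)
    print(before, hpa_desired_replicas(replicas, before, target))

    # VPA raises the request to 1500m; the load has not changed,
    # but utilization now reads 30% and HPA wants to scale in.
    after = cpu_utilization_percent(usage, 1500)
    print(after, hpa_desired_replicas(replicas, after, target))
```

With the same 450m of real usage per pod, HPA asks for 6 replicas before the VPA change and 2 replicas after it.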

➡ Cluster Autoscaling (CA)

While HPA scales the number of Pods, the CA changes the number of nodes. When your cluster
runs low on resources, the CA provisions a new compute unit (physical or virtual machine) and
adds it to the cluster; if there are too many empty nodes, the CA removes them to reduce
costs.
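
The decision described above can be sketched as a small planning function. This is a deliberately simplified model, not how the real Cluster Autoscaler is implemented: "running low on resources" is approximated here as "there are pods that cannot be scheduled", and the `Node` type, the `plan` function, and the node-count bounds are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    pod_count: int  # number of pods currently running on the node


def plan(nodes: List[Node],
         unschedulable_pods: int,
         min_nodes: int = 1,
         max_nodes: int = 10) -> Tuple[str, List[str]]:
    """Return ("add", []), ("remove", [node names]) or ("none", [])."""
    # Scale up: pods are waiting and the cluster may still grow.
    if unschedulable_pods > 0 and len(nodes) < max_nodes:
        return ("add", [])
    # Scale down: drop empty nodes, but never go below min_nodes.
    empty = [n.name for n in nodes if n.pod_count == 0]
    removable = empty[:max(0, len(nodes) - min_nodes)]
    if removable:
        return ("remove", removable)
    return ("none", [])


if __name__ == "__main__":
    print(plan([Node("a", 3)], unschedulable_pods=2))
    print(plan([Node("a", 3), Node("b", 0), Node("c", 0)],
               unschedulable_pods=0, min_nodes=2))
```

In the first call the pending pods trigger an "add"; in the second, only node "b" is removed because the hypothetical `min_nodes=2` floor protects the rest.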
