Autoscaling Methods in Kubernetes
Horizontal scaling, or scaling out, means that the number of running Pods dynamically increases
or decreases as your application usage changes. This is the job of the Horizontal Pod Autoscaler (HPA).
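As a rough illustration of how a replica count can follow usage, the sketch below applies the ratio rule described in the Kubernetes HPA documentation: the desired replica count is the current count scaled by observed metric over target metric, rounded up. The function name and sample numbers are made up for this example, and the real controller's tolerance band, stabilization window, and min/max replica bounds are left out.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Scale the replica count by the ratio of observed to target metric, rounded up.

    Simplified sketch: omits tolerance, stabilization, and min/max bounds.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


if __name__ == "__main__":
    # Hypothetical numbers: 4 Pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))  # scale out: 4 * 90 / 60 = 6
    # Same 4 Pods averaging 30% CPU against the same target.
    print(desired_replicas(4, 30.0, 60.0))  # scale in: 4 * 30 / 60 = 2
```

When the observed metric is above the target the result grows, and when it is below the target the result shrinks, which is the "increases or decreases as usage changes" behaviour described above.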
You can use the Vertical Pod Autoscaler (VPA) when your application serves heavyweight requests
that require more resources per Pod.
VPA recommends optimized CPU and memory request and limit values and, if configured to do so,
updates them automatically so that cluster resources are used efficiently. In other words, VPA does
not add more replicas of a Pod; it adjusts the CPU or memory allocated to each Pod.
HPA and VPA conflict when both act on CPU or memory. Don't use both together for the same set
of Pods. HPA measures utilization relative to the Pods' resource requests to trigger scaling, while
VPA modifies those same requests, so the two work against each other unless you configure the HPA
to use custom or external metrics.
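The conflict can be sketched with a small calculation. It assumes HPA targets CPU utilization computed as a percentage of the Pod's CPU request and uses the ratio rule from the Kubernetes HPA documentation; all function names and numbers are hypothetical. With the load held constant, changing only the request (as VPA would) changes the replica count HPA asks for.

```python
import math


def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization as a percentage of the Pod's CPU request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int, utilization: float, target: float) -> int:
    """Simplified HPA ratio rule (no tolerance or min/max bounds)."""
    return math.ceil(current_replicas * utilization / target)


if __name__ == "__main__":
    replicas = 4
    usage = 400.0   # millicores actually used per Pod (hypothetical)
    target = 50.0   # HPA target utilization in percent (hypothetical)

    # Before VPA acts: the request is 500m, so utilization is 80%.
    before = cpu_utilization_percent(usage, 500.0)
    print(before, desired_replicas(replicas, before, target))

    # After VPA raises the request to 1000m: same load, utilization is now 40%.
    after = cpu_utilization_percent(usage, 1000.0)
    print(after, desired_replicas(replicas, after, target))
```

In this sketch the first case asks for 7 replicas and the second for 4, even though the application's load never changed. Scaling on custom or external metrics avoids the problem because those values do not depend on the request that VPA rewrites.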
While HPA scales the number of Pods, the Cluster Autoscaler (CA) changes the number of nodes. When
your cluster runs low on resources, the CA provisions a new compute unit (physical or virtual machine)
and adds it to the cluster; if there are too many empty nodes, the CA removes them to reduce
costs.
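The add-or-remove decision can be pictured with a toy model. This is only a sketch under strong assumptions and is not how the Cluster Autoscaler is implemented: it assumes every node fits a fixed number of Pods (the hypothetical `pods_per_node`), adds nodes only when Pods are waiting for capacity, and otherwise removes every empty node. The real CA simulates scheduling and applies utilization thresholds and delays.

```python
import math


def node_delta(pending_pods: int, pods_per_node: int, empty_nodes: int) -> int:
    """Return how many nodes to add (positive) or remove (negative).

    Toy model: 'pods_per_node' is a hypothetical fixed capacity per node.
    """
    if pods_per_node <= 0:
        raise ValueError("pods_per_node must be positive")
    if pending_pods > 0:
        # Not enough room: add just enough nodes to hold the waiting Pods.
        return math.ceil(pending_pods / pods_per_node)
    # Nothing is waiting: remove nodes that run no Pods.
    return -empty_nodes


if __name__ == "__main__":
    print(node_delta(pending_pods=5, pods_per_node=2, empty_nodes=0))  # add 3 nodes
    print(node_delta(pending_pods=0, pods_per_node=2, empty_nodes=2))  # remove 2 nodes
```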