
10/4/2019 How to autoscale apps on Kubernetes with custom metrics

Daniel Weibel

How to autoscale apps on Kubernetes with custom metrics
PUBLISHED IN OCTOBER 2019

Welcome to Bite-sized Kubernetes learning, a regular column on the most interesting questions that we see online and during our workshops, answered by a Kubernetes expert.

https://learnk8s.io/autoscaling-apps-kubernetes/

Today's answers are curated by Daniel Weibel. Daniel is a software engineer and instructor at Learnk8s.

If you wish to have your question featured on the next episode, please get in
touch via email or you can tweet us at @learnk8s.

Did you miss the previous episodes? You can find them here.

How do you scale apps on Kubernetes?


Deploying an app to production with a static configuration is not optimal.

Traffic patterns can change quickly, and the app should be able to adapt to
them:

- When demand increases, the app should scale up (increasing the number of replicas) to stay responsive.
- When demand decreases, the app should scale down (decreasing the number of replicas) to not waste resources.

Kubernetes provides excellent support for autoscaling applications in the form of the Horizontal Pod Autoscaler.

In the following, you will learn how to use it.

Different types of autoscaling


First of all, to eliminate any misconceptions, let's clarify the use of the term
"autoscaling" in Kubernetes.

In Kubernetes, several things are referred to as "autoscaling", including:

- Horizontal Pod Autoscaler: adjusts the number of replicas of an application
- Vertical Pod Autoscaler: adjusts the resource requests and limits of a container
- Cluster Autoscaler: adjusts the number of nodes of a cluster

While these components all "autoscale" something, they are completely unrelated to each other.

They all address very different use cases and use different concepts and
mechanisms.

They are developed in separate projects and can be used independently from each other.

This article covers the Horizontal Pod Autoscaler.

What is the Horizontal Pod Autoscaler?


The Horizontal Pod Autoscaler is a built-in Kubernetes feature that allows you to horizontally scale applications based on one or more monitored metrics.

Horizontal scaling means increasing and decreasing the number of replicas. Vertical scaling means increasing and decreasing the compute resources of a single replica.


Technically, the Horizontal Pod Autoscaler is a controller in the Kubernetes controller manager, and it is configured by HorizontalPodAutoscaler resource objects.

The Horizontal Pod Autoscaler can monitor a metric about an app and
continuously adjust the number of replicas to optimally meet the current
demand.

Resources that can be scaled by the Horizontal Pod Autoscaler include the Deployment, StatefulSet, ReplicaSet, and ReplicationController.

To autoscale an app, the Horizontal Pod Autoscaler runs a continuous control loop:

[Figure: the control loop between the Horizontal Pod Autoscaler and the app: 1. Query, 2. Calculate, 3. Scale, repeated every 15 seconds]

The steps of this control loop are:

1. Query the scaling metric
2. Calculate the desired number of replicas
3. Scale the app to the desired number of replicas

The default period of the control loop is 15 seconds.
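The three steps above can be sketched as a small loop. This is only an illustration of the idea, not the controller's actual code: the function name run_control_loop and its parameters are made up for this example, and the three callables stand in for whatever really queries the metric, computes the replica count, and scales the app.

```python
import time
from typing import Callable


def run_control_loop(
    query_metric: Callable[[], float],
    calculate_replicas: Callable[[float], int],
    scale_app: Callable[[int], None],
    period_seconds: float = 15.0,
    iterations: int = 1,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run the query -> calculate -> scale cycle a fixed number of times.

    All names here are hypothetical; the real controller runs indefinitely.
    """
    for i in range(iterations):
        current_value = query_metric()               # 1. Query the scaling metric
        desired = calculate_replicas(current_value)  # 2. Calculate the desired replica count
        scale_app(desired)                           # 3. Scale the app
        if i < iterations - 1:
            sleep(period_seconds)                    # Wait until the next period starts
```

Passing the three steps in as functions keeps the loop itself trivial, and injecting sleep means you can try the loop out without actually waiting 15 seconds between iterations.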



The calculation of the desired number of replicas is based on the scaling metric and a user-provided target value for this metric.

The goal is to calculate a replica count that brings the metric value as close
as possible to the target value.

For example, imagine that the scaling metric is the per-second request rate
per replica:

- If the target value is 10 req/sec and the current value is 20 req/sec, the Horizontal Pod Autoscaler will scale the app up (i.e. increasing the number of replicas) to make the metric decrease and get closer to the target value.
- If the target value is 10 req/sec and the current value is 2 req/sec, the Horizontal Pod Autoscaler will scale the app down (i.e. decreasing the number of replicas) to make the metric increase and get closer to the target value.

The algorithm for calculating the desired number of replicas is based on the
following formula:

X = N * (c/t)

Where X is the desired number of replicas, N is the current number of replicas, c is the current value of the metric, and t is the target value.

You can find the details about the algorithm in the documentation.
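To get a feeling for the formula, here is a minimal sketch of it in Python. The function name desired_replicas is made up for this example. The result is rounded up to a whole number, since you can't run a fraction of a replica. The real algorithm has further details (such as a tolerance around the target and the minimum and maximum replica bounds) that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Compute X = N * (c / t), rounded up to a whole replica.

    Hypothetical helper for illustration; it ignores the tolerance and
    the min/max replica bounds that the real algorithm applies.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Multiply first, then divide: same value as N * (c / t)
    return math.ceil(current_replicas * current_value / target_value)
```

With a target of 10 req/sec, 4 replicas seeing 20 req/sec each give desired_replicas(4, 20, 10), which is 8, so the app is scaled up. With 5 replicas seeing only 2 req/sec each, desired_replicas(5, 2, 10) is 1, so the app is scaled down.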

That's how the Horizontal Pod Autoscaler works, but how do you use it?

How to configure the Horizontal Pod Autoscaler?

Configuring the Horizontal Pod Autoscaler to autoscale your app is done by
creating a HorizontalPodAutoscaler resource.

This resource allows you to specify the following parameters:

1. The resource to scale (e.g. a Deployment)
2. The minimum and maximum number of replicas
3. The scaling metric
4. The target value for the scaling metric

As soon as you create this resource, the Horizontal Pod Autoscaler starts
executing the above-mentioned control loop against your app with the
provided parameters.

A concrete HorizontalPodAutoscaler resource looks like this:

hpa.yaml

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myhpa
spec:
  # The resource to scale
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  # The minimum and maximum number of replicas
  minReplicas: 1
  maxReplicas: 10
  # The scaling metric and its target value
  metrics:
  - type: Pods
    pods:
      metric:
        name: myapp_requests_per_second
      target:
        type: AverageValue
        averageValue: 2

There are different versions of the HorizontalPodAutoscaler resource that differ in their manifest structure. The above example uses version v2beta2, which is the most recent one at the time of this writing.

This resource specifies a Deployment named myapp to be autoscaled between 1 and 10 replicas based on a metric named myapp_requests_per_second with a target value of 2.

You can imagine that the myapp_requests_per_second metric represents the request rate of the individual Pods of this Deployment. The intention of this specification is therefore to autoscale the Deployment with the goal of maintaining a request rate of 2 requests per second for each of the Pods.

So far, so good, but there's a catch.

Where do the metrics come from?

What is the metrics registry?


The entire autoscaling mechanism is based on metrics that represent the
current load of an application.

When you define a HorizontalPodAutoscaler resource, you have to specify such a metric.

But how does the Horizontal Pod Autoscaler know how to obtain these
metrics?

It turns out that there's another component in play — the metrics registry.

The Horizontal Pod Autoscaler queries metrics from the metrics registry:

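[Figure: the control loop with the metrics registry: the Horizontal Pod Autoscaler 1. queries the metrics registry, which holds the metrics of the app, 2. calculates, and 3. scales the app]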

The metrics registry is a central place in the cluster where metrics (of any
kind) are exposed to clients (of any kind).

The Horizontal Pod Autoscaler is one of these clients.

The purpose of the metrics registry is to provide a standard interface for clients to query metrics from.

The interface of the metrics registry consists of three separate APIs:

- The Resource Metrics API
- The Custom Metrics API
- The External Metrics API

[Figure: metrics sources feed the metrics registry, which exposes the Resource Metrics API, the Custom Metrics API, and the External Metrics API to clients]

These APIs are designed to serve different types of metrics:

- Resource Metrics API: predefined resource usage metrics (CPU and memory) of Pods and Nodes
- Custom Metrics API: custom metrics associated with a Kubernetes object
- External Metrics API: custom metrics not associated with a Kubernetes object

All of these metric APIs are extension APIs.

That means they are extensions to the core Kubernetes API that are accessible through the Kubernetes API server.
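Because the metric APIs are served through the Kubernetes API server, each of them lives under its own API group path. The following sketch builds such paths for Pod metrics. The helper names are made up for this example, and the v1beta1 version segment is an assumption: check which versions your cluster actually serves.

```python
def resource_metrics_path(namespace: str) -> str:
    """Path listing CPU and memory usage of the Pods in a namespace."""
    return f"/apis/metrics.k8s.io/v1beta1/namespaces/{namespace}/pods"


def custom_metrics_path(namespace: str, metric: str) -> str:
    """Path for a custom metric of all Pods ('*') in a namespace."""
    return f"/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace}/pods/*/{metric}"


def external_metrics_path(namespace: str, metric: str) -> str:
    """Path for an external metric, which is not tied to a Kubernetes object."""
    return f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{metric}"
```

If the corresponding metric API is enabled in your cluster, you can query such a path by passing it to kubectl get --raw, for example with the output of custom_metrics_path("default", "myapp_requests_per_second").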

What does that mean for you if you want to autoscale an app?

Any metric that you want to use as a scaling metric must be exposed through one of these three metric APIs, because only then is it accessible to the Horizontal Pod Autoscaler.

So, to autoscale an app, your task is not only to configure the Horizontal Pod Autoscaler: you also have to expose your desired scaling metric through the metrics registry.

How do you expose a metric through a metric API?

By installing and configuring additional components in your cluster.

For each metric API you need a corresponding metric API server and you
need to configure it to expose a specific metric through the metric API.

By default, no metric API servers are installed in Kubernetes, which means that the metric APIs are not enabled by default.

Furthermore, you need a metrics collector that collects the desired metrics
from the sources (e.g. from the Pods of the target app) and provides them to
the metric API server.

[Figure: metric collectors and metric API servers: cAdvisor feeds the Metrics Server, which serves the Resource Metrics API; Prometheus feeds the Prometheus Adapter, which serves the Custom Metrics API and the External Metrics API]

There are different choices of metric API servers and metric collectors for the
different metrics APIs.

Resource Metrics API:

- The metrics collector is cAdvisor, which runs as part of the kubelet on every worker node (so it's already installed by default)
- The official metric API server for the Resource Metrics API is the Metrics Server

Custom Metrics API and External Metrics API:

- A popular choice for the metrics collector is Prometheus; however, other metrics systems like Datadog or Google Stackdriver may be used instead
- The Prometheus Adapter is a metric API server that integrates with Prometheus as a metric collector; however, other metric collectors have their own metric API servers

So, to expose a metric through one of the metric APIs, you have to go
through these steps:

1. Install a metrics collector (e.g. Prometheus) and configure it to collect the desired metric (e.g. from the Pods of your app)
2. Install a metric API server (e.g. the Prometheus Adapter) and configure it to expose the desired metric from the metrics collector through the corresponding metrics API

Note that this applies specifically to the Custom Metrics API and
External Metrics API, which serve custom metrics. The Resource
Metrics API only serves default metrics and can't be configured to
serve custom metrics.

This was a lot of information, so let's put the bits together.

Putting everything together

Let's go through a full example of configuring an app to be autoscaled by the Horizontal Pod Autoscaler.

Imagine you want to autoscale a web app based on the average per-second request rate of the replicas.

Also, assume that you want to use a Prometheus-based setup for exposing
the request rate metric through the Custom Metrics API.

The request rate is a custom metric associated with a Kubernetes object (Pods), so it must be exposed through the Custom Metrics API.

Here's a sequence of steps to reach your goal:

1. Instrument your app to expose the total number of received requests as a Prometheus metric
2. Install Prometheus and configure it to collect this metric from all the Pods of your app
3. Install the Prometheus Adapter and configure it to turn the metric from Prometheus into a per-second request rate (using PromQL) and expose that metric as myapp_requests_per_second through the Custom Metrics API
4. Create a HorizontalPodAutoscaler resource (as shown above) specifying myapp_requests_per_second as the scaling metric and an appropriate target value
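Step 3 in the list above turns an ever-growing request counter into a per-second rate. In the real setup that conversion is a PromQL query, but the underlying arithmetic can be sketched in a few lines of Python. The names Sample and per_second_rate are made up for this example, and the calculation is deliberately simplified: PromQL's rate() function also handles counter resets and extrapolates to the edges of the time window.

```python
from typing import List, Tuple

# One scraped sample: (timestamp in seconds, total requests received so far)
Sample = Tuple[float, float]


def per_second_rate(samples: List[Sample]) -> float:
    """Average per-second increase of a counter over the sampled window.

    Simplified illustration: assumes the counter never resets and the
    samples are ordered by time.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (v_last - v_first) / elapsed
```

For example, a Pod whose counter goes from 100 to 220 requests within 60 seconds has served 2 requests per second on average over that window, which is the kind of value the Horizontal Pod Autoscaler then compares against the target.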

As soon as the HorizontalPodAutoscaler resource is created, the Horizontal Pod Autoscaler kicks in and starts autoscaling your app according to your configuration.

And you can lean back and watch your app adapt to traffic.

This article sets the theoretical framework for autoscaling an application based on a custom metric.

In a future article, you will put this knowledge into practice and execute the
above steps with your own app on your own cluster.

From zero to a fully autoscaled application.

Stay tuned!

That's all folks!


If you enjoyed this article, you might find the following articles interesting:

Architecting Kubernetes clusters — choosing a worker node size where you'll learn the pros
and cons of having clusters with large and small instance types for your cluster nodes.

Boosting your kubectl productivity. If you work with Kubernetes, then kubectl is probably one of your most-used tools. Whenever you spend a lot of time working with a specific tool, it is worth getting to know it very well and learning how to use it efficiently.

More autoscaling, metrics, etc.


The article is a summary of the first three modules of the autoscaling course on the Learnk8s Academy. The full course includes a deep dive into the three different metrics servers as well as how to:

- expose metrics from your application
- install and configure Prometheus to collect metrics
- configure the custom and external metrics adapters to serve custom metrics to Kubernetes
- tune the Horizontal Pod Autoscaler
