What is Kubernetes?

We are at an interesting phase in software development, where the focus has clearly shifted to
how we can write distributed, scalable applications that can be deployed, run and monitored
effectively.

Docker took the first steps towards addressing this piece of the puzzle by revolutionizing the toolset
that makes it easy for teams to build, ship and run software. The fact that we can now package all
application dependencies in a single image, reuse common layers in the stack and run the
application anywhere that Docker is supported has taken the pain out of distributing and running
applications in a reliable way. Docker has enabled many efficiencies in software development that
help eliminate the pain that was inherent whenever an application changed hands, either across
teams (dev to QA to operations) or across environments (dev to test to stage to
production).

But is Docker enough? If you look at current applications that have to be deployed to serve a large
number of users, running your application in a container or two will not be enough. You will need
to run a cluster of containers, spread out across regions and capable of load balancing. Those
are just a few of the requirements. Other requirements expected of today's large-scale distributed
deployments include:

Managing Application Deployments across Containers
Health Checks across Containers
Monitoring these clusters of containers
Ability to programmatically do all of the above via a toolset/API

This is where Kubernetes comes in. As per its official page, Kubernetes is an open-source system for
automating deployment, scaling and management of containerized applications.

Image Reference : http://kubernetes.io/

Kubernetes builds on software that was developed at Google and that has been running Google's
workloads for more than a decade. This internal system, named Borg, is the precursor to today's
Kubernetes. Since its announcement and the release of its code as open source a couple of years
ago, Kubernetes has been one of the most popular repositories on GitHub. The project
has seen significant backing from multiple vendors, and one of its most notable features is that it
can run across multiple cloud providers and even on on-premises infrastructure. It is not currently tied
to any particular vendor.

Is Kubernetes the only container cluster management software out there? Definitely not. A couple of
competing solutions exist: Docker Swarm and Apache Mesos. However, the current
leader in terms of features and momentum is Kubernetes, since Docker Swarm still lacks features that
would make it a serious contender, while Apache Mesos comes with its own complexity.

Since Kubernetes comes with its own toolset, you can configure and run the software on
multiple cloud providers, on-premises servers and even your local machine. Google Cloud Platform
also offers the Google Container Engine (GKE), a container service built on top
of Kubernetes, which makes managing a Kubernetes cluster a breeze.

Kubernetes has definitely gained significant mindshare and is currently being promoted as the go-to
infrastructure software for running distributed container workloads. It comes with its own learning
curve, since it introduces various concepts such as Node, Pod, Replication Controller, Health Checks,
Services, Labels and more. In a future blog post, we will break down these building blocks to
understand how Kubernetes orchestrates the show.

Introduction to Kubernetes Building Blocks



In an earlier blog post, we covered the basics of Kubernetes and the Container Orchestration features
that it provides. To summarize, as the number of containers increases, you need an orchestration layer that
helps with auto-scaling, rolling updates, health checks, service endpoints, load balancing and more. Kubernetes
provides us with that functionality.

In this post, we are going to look at the basic building blocks in Kubernetes. The diagram is taken from the
reference architecture section of the official Kubernetes documentation.

Reference Architecture Diagram

Let us break it down into key building blocks that we need to understand.

Cluster and Nodes

In the diagram above, a cluster is a group of nodes, where a node can be a physical machine or a virtual machine.
Each node has a container runtime (Docker or rkt) and also runs a kubelet service,
an agent that takes commands from the Master Controller (more on that later), and a Proxy, which is
used to proxy connections to the Pods from another component (Services, which we will see later).

Pod

A Pod is a group of containers that form a logical application. For example, if you have a web application that
runs a NodeJS container and a MySQL container, then both of these containers will be located in a single
Pod. The containers in a Pod can share common data volumes, and they also share the same networking namespace.
Remember that Pods are ephemeral: they can be brought up and down by the Master Controller.
Kubernetes uses a simple but effective means of identifying Pods: Labels (name/value pairs).
Master Controller

This is the main controller for your cluster and it takes care of multiple things for you. Think of it as the heart of
your operations, enabling all the features for which you want Kubernetes in the first place. Usually there is
one Master Controller, as you can see in the diagram, and it has multiple components in it, such as a Discovery Service
for the Pods, the Replication Controller, Scheduling and an API Manager that takes in commands from the command
line utility (kubectl) and communicates with the Nodes.
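
As a rough illustration of what the Scheduling component has to decide, the Python sketch below places each new Pod on the Node that is currently running the fewest Pods. This placement rule is an assumption made for the example, as are the schedule function and the node names; the real scheduler weighs many more factors.

```python
def schedule(pod_name, nodes):
    """Place pod_name on the Node currently running the fewest Pods.

    nodes maps a Node name to the list of Pod names running on it.
    The "fewest Pods" rule is an assumption made for this sketch.
    """
    if not nodes:
        raise ValueError("cluster has no nodes")
    # min() returns the first Node in iteration order when there is a tie.
    target = min(nodes, key=lambda node: len(nodes[node]))
    nodes[target].append(pod_name)
    return target


# Illustrative cluster state (assumed).
nodes = {"node-a": ["web-1"], "node-b": [], "node-c": ["db-1"]}

print(schedule("web-2", nodes))  # prints node-b (it was empty)
print(schedule("web-3", nodes))  # prints node-a (three-way tie, first wins)
```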

Replication Controller

One of the features that we talked about using Kubernetes for is the auto-scaling (up or down) of Pods. This is
done by the Replication Controller component. All you need to do is specify the number of Pods, the container
images that need to be started on them and rules for launching or bringing down the Pods. The controller will
take care of keeping that number of Pods running on the Nodes.
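
The core idea, comparing the desired number of Pods with the number actually running and correcting the difference, can be sketched in a few lines of Python. This is a toy model under stated assumptions: the ReplicationControllerSketch class, its reconcile method and the Pod naming scheme are hypothetical, and no real containers are started.

```python
import itertools


class ReplicationControllerSketch:
    """Toy model: keeps the number of running Pods equal to `replicas`."""

    def __init__(self, image, replicas):
        self.image = image          # container image to start on each Pod
        self.replicas = replicas    # desired number of Pods
        self.running = []           # names of the Pods currently running
        self._ids = itertools.count(1)

    def reconcile(self):
        """Start or stop Pods until the running count matches the desired count."""
        while len(self.running) < self.replicas:
            self.running.append(f"{self.image}-{next(self._ids)}")
        while len(self.running) > self.replicas:
            self.running.pop()
        return list(self.running)


rc = ReplicationControllerSketch(image="web", replicas=3)
print(rc.reconcile())        # prints ['web-1', 'web-2', 'web-3']

rc.running.remove("web-2")   # simulate a Pod going down
print(rc.reconcile())        # prints ['web-1', 'web-3', 'web-4']

rc.replicas = 1              # scale down
print(rc.reconcile())        # prints ['web-1']
```

Note that the same reconcile step handles both a failed Pod and a change in the desired count, which is why you only need to declare the number of Pods you want.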

Services

If we have multiple Pods running, how do we ensure that there is a single endpoint to access them? A
Service takes care of that. It provides a unified way to route traffic to a cluster and eventually to a list of Pods.
Keep in mind that Labels tie the two together: the Service uses Labels to identify the Pods that it routes to. By using
a Service, Pods can be brought up and down without affecting anything. It is seamless to the client who is using it.
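
The following Python sketch illustrates why a client is unaffected when Pods come and go: the Service looks up the matching Pods by Label on every request. The ServiceSketch class and the Pod names are hypothetical, and the round-robin choice of target is an assumption made for the example rather than a statement about how Kubernetes balances traffic.

```python
class ServiceSketch:
    """Toy model of a Service: one stable entry point in front of changing Pods."""

    def __init__(self, selector):
        self.selector = selector  # Labels that a Pod must carry to receive traffic
        self._next = 0            # request counter used for round-robin (assumed policy)

    def endpoints(self, pods):
        """Return the names of the Pods whose Labels match the selector."""
        return [
            name for name, labels in pods.items()
            if all(labels.get(k) == v for k, v in self.selector.items())
        ]

    def route(self, pods):
        """Pick the Pod that should handle the next request."""
        targets = self.endpoints(pods)
        if not targets:
            raise LookupError("no Pods match the Service selector")
        target = targets[self._next % len(targets)]
        self._next += 1
        return target


# Illustrative Pods, keyed by name, with their Labels (assumed).
pods = {"web-1": {"app": "web"}, "web-2": {"app": "web"}, "db-1": {"app": "db"}}
service = ServiceSketch({"app": "web"})

print([service.route(pods) for _ in range(3)])  # prints ['web-1', 'web-2', 'web-1']

# A Pod goes away and a new one appears; the client keeps calling the same Service.
del pods["web-1"]
pods["web-3"] = {"app": "web"}
print(service.endpoints(pods))  # prints ['web-2', 'web-3']
```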

As you can see, Kubernetes may look complex, but it is made up of a set of components with well-defined
functionality. You have the choice of running the Kubernetes installation either locally or via fully managed
services available in the cloud. For example, Google provides the Google Container Engine (GKE), where
they provide you with a Master Controller that is fully managed by them. All you need to do is define your
Cluster, Nodes and the definitions for Pods, Services and Replication Controllers. Fully managed Container
Orchestration services do make it easier to jumpstart your move to Kubernetes.