Introduction to Kubernetes Architecture

July 11, 2016 11 min read

Containerisation has given developers a lot of flexibility in managing how their applications are deployed. However, the more granular an application is, the more components it consists of, and the more those components need some form of management themselves.

One still needs to take care of scheduling the deployment of a certain number of containers to a specific node, managing networking between the containers, tracking resource allocation, moving containers around as they grow, and much more.

Nearly all applications nowadays need answers for concerns such as:

Replication of components
Auto-scaling
Load balancing
Rolling updates
Logging across components
Monitoring and health checking
Service discovery
Authentication

Google's combined solution for all of this is Kubernetes, or K8s for short.

In this article, we will look into the moving parts of Kubernetes: what the key elements are, what they are responsible for, and how they are typically used. We will then install them all using the Docker container that the K8s team provides as a playground, and review the deployed components.

Glossary

Before we dive into setting up the components, you should get comfortable with some Kubernetes terminology.

Pod

Kubernetes targets the management of elastic applications that consist of multiple microservices communicating with each other. Often those microservices are tightly coupled, forming a group of containers that would typically, in a non-containerized setup, run together on one server. This group, the smallest unit that can be scheduled for deployment through K8s, is called a pod.

The containers in a pod share storage, Linux namespaces, cgroups and an IP address. They are co-located, hence share resources, and are always scheduled together.

Pods are not intended to live long. They are created, destroyed and re-created on demand, based on the state of the server and the service itself.

Service

As pods have a short lifetime, there is no guarantee about the IP address they are served on. This could make communication between microservices hard.
Imagine a typical frontend that needs to communicate with backend services.

Hence K8s has introduced the concept of a service, which is an abstraction on top of a number of pods. It typically requires a proxy to run in front of them, so that other services can reach the pods via a virtual IP address.
This is where you can configure load balancing for your numerous pods and expose them via a service.
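The idea of one stable entry point in front of a changing set of pods can be sketched in a few lines of shell. This is a toy model only: pick_backend is a hypothetical helper, the pod addresses in the usage comment are made up, and round-robin is just one simple policy, not a statement about how Kubernetes balances traffic.

```shell
# pick_backend REQUEST_NUMBER BACKEND...
# Toy round-robin: maps the Nth request onto one of the backends that
# currently sit behind the service. Needs at least one backend.
pick_backend() {
  request=$1
  shift
  # Drop (request mod number-of-backends) entries; the next one is chosen.
  shift $(( request % $# ))
  printf '%s\n' "$1"
}

# Usage with made-up pod addresses:
#   pick_backend 0 10.1.0.4 10.1.0.7 10.1.0.9   # first backend
#   pick_backend 4 10.1.0.4 10.1.0.7 10.1.0.9   # second backend (4 mod 3 = 1)
```

Callers only ever pass a request to the helper. The list of backends can change between calls without the callers noticing, which is the property a service provides.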

Kubernetes components

A K8s setup consists of several parts, some of them optional, some mandatory for the whole system to function.

This is a high-level diagram of the architecture

Let’s have a look at each component's responsibilities.

Master Node

The master node is responsible for managing the Kubernetes cluster. It is the entry point for all administrative tasks. The master node takes care of orchestrating the worker nodes, where the actual services are running.

Let's dive into each of the components of the master node.

API server

The API server is the entry point for all the REST commands used to control the cluster. It processes the REST requests, validates them, and executes the bound business logic. The resulting state has to be persisted somewhere, and that brings us to the next component of the master node.

etcd storage

etcd is a simple, distributed, consistent key-value store. It’s mainly used for shared configuration and service discovery.
It provides a REST API for CRUD operations as well as an interface to register watchers on specific nodes (etcd's own term for its keys and directories, not cluster machines), which enables a reliable way to notify the rest of the cluster about configuration changes.

Examples of data that Kubernetes stores in etcd include jobs being scheduled, created and deployed, pod and service details and state, namespaces, and replication information.

scheduler

The scheduler component is responsible for deploying configured pods and services onto the nodes.
The scheduler knows which resources are available on the members of the cluster, as well as which ones the configured service needs in order to run, and is hence able to decide where to deploy a specific service.
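The decision described above can be sketched as "filter out the nodes that do not fit, then pick the best of the rest". The following is a minimal illustration, not the real scheduler: pick_node is a hypothetical helper, it looks at a single resource (free CPU in millicores), and the node names in the usage comment are made up.

```shell
# pick_node REQUIRED_MILLICORES
# Reads "node-name free-millicores" lines on stdin and prints the node with
# the most free CPU among those that can fit the request.
# Exits non-zero when no node fits.
pick_node() {
  awk -v need="$1" '
    BEGIN { best = -1 }
    ($2 + 0) >= (need + 0) && ($2 + 0) > best { best = $2 + 0; node = $1 }
    END { if (node != "") print node; else exit 1 }
  '
}

# Usage with made-up nodes:
#   printf 'node-a 500\nnode-b 1500\nnode-c 900\n' | pick_node 800
```

With the sample input, node-a is filtered out because it cannot fit 800 millicores, and node-b wins over node-c because it has more free capacity.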

controller-manager

Optionally, you can run different kinds of controllers inside the master node. controller-manager is a daemon embedding those.
A controller uses the apiserver to watch the shared state of the cluster and makes corrective changes to move the current state towards the desired one.
An example of such a controller is the Replication controller, which takes care of the number of pods in the system. The replication factor is configured by the user, and it's the controller’s responsibility to recreate a failed pod or remove an extra-scheduled one.
Other examples of controllers are the endpoints controller, the namespace controller, and the serviceaccounts controller, but we will not dive into details here.
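The corrective step of the Replication controller can be sketched as a comparison between the desired and the observed number of pods. This is a simplified model under stated assumptions: reconcile is a hypothetical helper that only prints the action it would take, whereas a real controller watches the apiserver and acts on the cluster.

```shell
# reconcile DESIRED CURRENT
# Prints the corrective action needed to reach the desired number of pods.
reconcile() {
  desired=$1
  current=$2
  if [ "$current" -lt "$desired" ]; then
    echo "create $(( desired - current ))"
  elif [ "$current" -gt "$desired" ]; then
    echo "delete $(( current - desired ))"
  else
    echo "noop"
  fi
}

# Usage:
#   reconcile 3 1   # a pod failed twice over: prints "create 2"
#   reconcile 3 5   # extra pods were scheduled: prints "delete 2"
```

Running this comparison repeatedly is what keeps the cluster converging on the configured replication factor, regardless of why the counts drifted apart.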

Worker node

The pods are run here, so the worker node contains all the necessary services to manage the networking between the containers, communicate with the master node, and assign resources to the containers scheduled.

Docker

Docker runs on each of the worker nodes, and runs the configured pods. It takes care of downloading the images and starting the containers.

kubelet

kubelet gets the configuration of a pod from the apiserver and ensures that the described containers are up and running. This is the worker service that’s responsible for communicating with the master node.
It does not talk to etcd directly: information about services and the details of newly created pods are read and written through the apiserver, which persists them in etcd.

kube-proxy

kube-proxy acts as a network proxy and a load balancer for a service on a single worker node. It takes care of the network routing for TCP and UDP packets.

kubectl

And the final bit: kubectl, a command-line tool that communicates with the API server and sends commands to the master node.

Sample setup

The easiest way to start is to run all the described components inside a docker container.
Kubernetes provides a ready docker instance that would start up other parts of the system, and eventually, the picture of the setup looks like this

Prerequisites

The only thing you need for this setup is a docker daemon. The rest is taken care of by the docker image.
This example is for educational purposes. In a normal setup, you would have the master and worker(s) separated and possibly running behind a firewall.

Dockerized deployment

Let’s start by retrieving the latest stable version of Kubernetes and storing it in an environment variable, so that we can use it in later calls.
Run this command on the host server, where the docker daemon is running.

export K8S_VERSION=$(curl -sS https://storage.googleapis.com/kubernetes-release/release/stable.txt)

This will make a request to the Kubernetes stable release file and set the environment variable K8S_VERSION to the latest stable release version.
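If that request fails, the variable stays empty and the image tag built from it later becomes invalid. A small guard can catch this early. This is only a sketch: is_release_tag is a hypothetical helper, and the v<major>.<minor>.<patch> pattern is an assumption based on the v1.2.4 tag that appears in the output later in this article.

```shell
# Hypothetical helper (not part of Kubernetes): succeeds only when the
# argument looks like a release tag such as v1.2.4.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Usage, after exporting K8S_VERSION as shown above:
#   is_release_tag "$K8S_VERSION" || echo "unexpected version: '$K8S_VERSION'" >&2
```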

We also assume that the host you are running on has the amd64 architecture (you can refer to other architectures in this discussion).

Export the system architecture as a variable on the host server:

export ARCH=amd64

Next, we will run the hyperkube docker instance, which will itself take care of downloading and starting the rest of the Kubernetes components.

The hyperkube container is given special, extended privileges so that it can access resources of the host, as well as run other containers on the host.

That is achieved by, first of all, mounting the root directory / of the host at /rootfs in read-only mode, and mapping /sys, /var/lib/docker, /var/lib/kubelet and /var/run of the host to the same container directories in read-write mode.

Secondly, the container is run with the --privileged option, which grants the container access to all devices of the host and allows starting new containers on the machine. This flag also requires the /var/lib/docker path to be a volume.

--net=host and --pid=host give the container access to the network and PID namespaces of the host, which basically allows the docker container we are running to see and control processes on the host and to reuse its network configuration.

So, putting that all together, and specifying the gcr.io/google_containers/hyperkube-${ARCH}:${K8S_VERSION} image as well as the properties of the hyperkube container, we get to run the following command on the host:

docker run -d \
--volume=/:/rootfs:ro \
--volume=/sys:/sys:rw \
--volume=/var/lib/docker/:/var/lib/docker:rw \
--volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
--volume=/var/run:/var/run:rw \
--net=host \
--pid=host \
--name=hyperkube-installer \
--privileged \
gcr.io/google_containers/hyperkube-${ARCH}:${K8S_VERSION} \
/hyperkube kubelet \
--containerized \
--hostname-override=127.0.0.1 \
--api-servers=http://localhost:8080 \
--config=/etc/kubernetes/manifests \
--allow-privileged --v=2

The started container runs the hyperkube binary which itself starts all the components we discussed so far.

This might take a moment, so after a while if you run the docker ps command, you should see an output similar to this:

docker ps
CONTAINER ID        IMAGE                                             COMMAND                  CREATED              STATUS              PORTS               NAMES
f267d9d1a24d        gcr.io/google_containers/hyperkube-amd64:v1.2.4   "/setup-files.sh IP:1"   About a minute ago   Up About a minute                       k8s_setup.eb843218_k8s-master-127.0.0.1_default_721118f359852533089009890ac21208_3765dd28
09fc01b04ba6        gcr.io/google_containers/hyperkube-amd64:v1.2.4   "/hyperkube scheduler"   About a minute ago   Up About a minute                       k8s_scheduler.dbfcc0_k8s-master-127.0.0.1_default_721118f359852533089009890ac21208_28c46205
4ac31a50c2bb        gcr.io/google_containers/hyperkube-amd64:v1.2.4   "/hyperkube apiserver"   About a minute ago   Up About a minute                       k8s_apiserver.1082c1e0_k8s-master-127.0.0.1_default_721118f359852533089009890ac21208_d5d50d58
375857e4dec5        gcr.io/google_containers/hyperkube-amd64:v1.2.4   "/hyperkube proxy --m"   About a minute ago   Up About a minute                       k8s_kube-proxy.a1014855_k8s-proxy-127.0.0.1_default_0cda4a663a246109121ac68b3c7e82b7_890d5a95
8b19a89a2695        gcr.io/google_containers/etcd:2.2.1               "/usr/local/bin/etcd "   About a minute ago   Up About a minute                       k8s_etcd.7e452b0b_k8s-etcd-127.0.0.1_default_1df6a8b4d6e129d5ed8840e370203c11_9e621ad8
3d6c9d9c60cd        gcr.io/google_containers/hyperkube-amd64:v1.2.4   "/hyperkube controlle"   About a minute ago   Up About a minute                       k8s_controller-manager.76914b67_k8s-master-127.0.0.1_default_721118f359852533089009890ac21208_cb0abac9
155351af7913        gcr.io/google_containers/pause:2.0                "/pause"                 About a minute ago   Up About a minute                       k8s_POD.6059dfa2_k8s-master-127.0.0.1_default_721118f359852533089009890ac21208_4365c22c
84b32314d407        gcr.io/google_containers/pause:2.0                "/pause"                 About a minute ago   Up About a minute                       k8s_POD.6059dfa2_k8s-etcd-127.0.0.1_default_1df6a8b4d6e129d5ed8840e370203c11_1003b43b
5e44113ee806        gcr.io/google_containers/pause:2.0                "/pause"                 About a minute ago   Up About a minute                       k8s_POD.6059dfa2_k8s-proxy-127.0.0.1_default_0cda4a663a246109121ac68b3c7e82b7_f37316b9
197cd920afc5        gcr.io/google_containers/hyperkube-amd64:v1.2.4   "/hyperkube kubelet -"   About a minute ago   Up About a minute                       hyperkube-installer

Looking at the names of the docker instances, it’s not hard to guess what each component stands for:

The first container in the list, k8s_setup.k8s-master-127.0.0.1, takes care of setting up the master node. Its work boils down to creating:
the basic auth file for access to the Kubernetes API server
service tokens for accessing the Kubernetes API server
the CA cert and keys for HTTPS access to the Kubernetes API server
The list also contains a container for each of the components we discussed: k8s_scheduler, k8s_apiserver, k8s_kube-proxy, k8s_etcd and k8s_controller-manager
The next 3 pod containers are so-called “pause” containers, which are used to set up the networking initially, before the real containers are launched
And finally, hyperkube-installer, the installer container we started to put this all together
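The container names in the listing can also be decoded mechanically. The sketch below assumes the field layout that can be observed in the output above (prefix, container name with a hash, pod name, namespace, pod UID and a suffix, separated by underscores). That layout is inferred from this output rather than from a documented contract, and describe_container is a hypothetical helper.

```shell
# describe_container DOCKER_NAME
# Splits a name such as
#   k8s_scheduler.dbfcc0_k8s-master-127.0.0.1_default_<uid>_<suffix>
# into its container, pod and namespace parts.
# Prints nothing for names that do not start with "k8s_".
describe_container() {
  printf '%s\n' "$1" | awk -F_ '
    $1 == "k8s" {
      split($2, c, ".")   # drop the hash that follows the container name
      print "container=" c[1] " pod=" $3 " namespace=" $4
    }
  '
}

# Usage on the host, feeding it the NAMES column of docker ps:
#   docker ps --format '{{.Names}}' | while read -r name; do describe_container "$name"; done
```

Names that Kubernetes did not generate, such as hyperkube-installer, produce no output, which makes it easy to tell the managed containers apart from the one started by hand.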

Deploying a new pod

Now that we have a small Kubernetes setup on our host, we should get the command line tool to deploy our first pod and service.

We will use the installer docker container to demonstrate kubectl in action.

Enter the installer container using

docker exec -it hyperkube-installer /bin/bash

Export the K8S_VERSION variable again, while being inside the installer container

export K8S_VERSION=$(curl -sS https://storage.googleapis.com/kubernetes-release/release/stable.txt)

The following command will download the correct version of kubectl and place it at /usr/bin/kubectl

curl -sSL "http://storage.googleapis.com/kubernetes-release/release/$K8S_VERSION/bin/linux/amd64/kubectl" > /usr/bin/kubectl

Make it executable

chmod +x /usr/bin/kubectl

Now you should be able to run a simple kubectl version command and get a similar output

kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}

To list the nodes

kubectl get nodes

NAME STATUS AGE
127.0.0.1 Ready 1d

And finally, to deploy a sample nginx as a pod

kubectl run nginx --image=nginx --port=80

You can now exit the hyperkube-installer container we are in, using CTRL+D, and wait a bit for the nginx image to get downloaded. If you run the docker ps command on the host again, you will see 2 more containers appearing

docker ps
CONTAINER ID        IMAGE                                             COMMAND                  CREATED              STATUS              PORTS               NAMES
b99ae41911d9        nginx                                             "nginx -g 'daemon off"   31 minutes ago      Up 31 minutes                           k8s_nginx.c8c72836_nginx-198147104-sdqet_default_791e301f-358b-11e6-94ba-080027faa9e4_cde648eb
efff5259a994        gcr.io/google_containers/pause:2.0                "/pause"                 31 minutes ago      Up 31 minutes                           k8s_POD.cf58006d_nginx-198147104-sdqet_default_791e301f-358b-11e6-94ba-080027faa9e4_3e7040c5

As you can see, 2 containers have been created by our kubectl run command. The first one is the actual nginx container created and deployed by Kubernetes. The second one, as explained above, is an infrastructure container: it is the first container started for a pod, it sets up the network for the pod, and then pauses. All other containers of that pod join the network set up by this container.

In order to access the deployed nginx container, we need to declare it as a service and obtain the concrete IP address that gets assigned to it. Enter the installer container again:

docker exec -it hyperkube-installer /bin/bash

The following command exposes the nginx deployment as a service and maps port 8080 of the service to port 80 of the nginx container

kubectl expose deployment nginx --port=8080 --target-port=80

Now, to see the list of services we have, we run kubectl get service nginx. The output should be similar to

NAME      CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
nginx     10.0.0.185                 8080/TCP   12s

You can see the list of services, with assigned IP address and exposed ports.
Let’s take the IP address only and set it to a variable to try accessing nginx.

ip=$(kubectl get svc nginx --template='{{.spec.clusterIP}}')

Now that the ip variable is pointing to the correct service (in this case that would be 10.0.0.185), we can run the curl for the default nginx page

curl "http://$ip:8080/"

The output should contain the usual nginx welcome page.
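If the request is made before the nginx image has finished downloading, it will fail. Instead of re-running it by hand, the call can be wrapped in a generic retry helper. This is a sketch: retry is a hypothetical function, and the attempt count and delay in the usage comment are arbitrary values.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$(( i + 1 ))
    # Sleep only if another attempt is still to come.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage: try up to 30 times, 2 seconds apart. curl's -f flag turns HTTP
# errors into a non-zero exit status, and -s silences the progress output.
#   retry 30 2 curl -sf "http://$ip:8080/"
```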

Summary

This guide gives you a high-level overview of the architecture and moving parts of a Kubernetes setup. We deployed a mini Kubernetes setup inside a docker container, and deployed our first pod.

In order to get a deeper understanding, you are highly encouraged to deploy each of the components separately and set up their communication from scratch.