Sunday 28 April 2024

Introduction to Kubernetes

These are custom notes that extend my notes from the Udemy course "Kubernetes for the Absolute Beginners - Hands-on". All course content rights belong to the course creators.

Introduction


Web applications are nowadays developed with containerisation in mind, as containers contain everything that is needed to run the application: code, runtime, system tools, system libraries, settings, etc.


Kubernetes (k8s) is:
  • Platform for managing application containers (containerised applications, container-oriented applications, containerized workloads and services) across multiple hosts (one or more host clusters)
  • The easiest and the most recommended way to manage containers in production
  • It makes it easy to orchestrate many containers on many hosts, scale them as microservices, and perform deployments, rollouts and rollbacks
  • a set of APIs that we can use to deploy containers on a set of nodes called a cluster. 
  • We can describe a set of applications and how they should interact with each other, and Kubernetes determines how to make that happen
  • Workload scheduler with a focus on containerized applications.
  • Container orchestration technology -  system for automating the operations and management of application containers in complex, multi-container workloads:
    • Container creation
    • Container deployment
    • Rolling deployment *)
    • Auto-scaling
    • Load balancing
    • Container health monitoring
    • Compute resource management
    • Volume management
    • Persistent storage
    • Networking
    • High availability by cluster federation
  • Open-source
  • Originally designed by Google, based upon their running of containers in production. Now maintained by the Cloud Native Computing Foundation
  • Supports hosting large and complex applications on various kinds of architectures; it is designed to run anywhere:
    • on bare metal
    • in our data center
    • on the public cloud - supported on any cloud platform; it is platform-agnostic and integrates with a number of different cloud providers, allowing us to pick the platform that best suits our needs
    • on the hybrid cloud
  • Two steps are involved in scheduling containers on a Kubernetes cluster:
    • Provisioning the Kubernetes cluster, with all its components, somewhere
    • Defining the Kubernetes resources, such as Deployments, Services, etc.
  • With Kubernetes:
    • we can decide when our containers should run
    • increase or decrease the number of running application containers
    • check the resource consumption of our application deployments
  • To save time and effort when scaling applications and workloads, Kubernetes can be bootstrapped using a managed service (see the example commands after this list):
    • Amazon Elastic Kubernetes Service (EKS)
    • Google Kubernetes Engine (GKE)
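
For illustration, a hedged sketch of bootstrapping a managed cluster from the command line (cluster name, node count and zone are placeholder values; both CLIs must be installed and authenticated):

  gcloud container clusters create my-cluster --num-nodes=3 --zone=europe-west1-b   # GKE
  eksctl create cluster --name my-cluster --nodes 3                                  # EKS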

*) Rolling deployment:
  • A deployment strategy that slowly replaces previous versions of an application with new versions by incrementally replacing the instances (e.g. pods) on which the application is running;
  • It is renowned for its ability to update applications without downtime: incrementally updating nodes or replicas ensures that the service remains available to users throughout the deployment process.
  • Rolling deployments use the concept of a window size—this is the number of servers that are updated at any given time. For example, if a Kubernetes cluster is running 10 instances of an application (10 pods), and you want to update two of them at a time, you can perform a rolling deployment with a window size of 2.
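
In Kubernetes, this window is expressed through the rolling update strategy of a Deployment, mainly its maxUnavailable and maxSurge fields. A minimal sketch, assuming a hypothetical python-app Deployment with 10 replicas and a window size of 2:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: python-app                # hypothetical name
  spec:
    replicas: 10
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 2           # the "window size": at most 2 pods are taken down at a time
        maxSurge: 0                 # do not create extra pods above the desired count during the update
    selector:
      matchLabels:
        app: python-app
    template:
      metadata:
        labels:
          app: python-app
      spec:
        containers:
          - name: python-app
            image: python-app:2.0   # hypothetical image tag being rolled out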


To revise our knowledge on containers, let's read Introduction to Containers.


Deploying containers on nodes by using a wrapper around one or more containers is what defines a pod. A pod is the smallest unit in Kubernetes that you can create or deploy. It represents a running process on your cluster as either a component of your application or an entire app. Generally, you only have one container per pod, but if you have multiple containers with a hard dependency, you can package them into a single pod and share networking and storage resources between them. The pod provides a unique network IP and set of ports for your containers, and configurable options that govern how your containers should run. 

One way to run a container in a pod in Kubernetes is to use the kubectl run command (in current kubectl versions this creates a standalone pod, while kubectl create deployment starts a deployment with a container running inside a pod).
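
A minimal sketch of both commands (nginx is used only as a well-known placeholder image):

  kubectl run my-pod --image=nginx                  # creates a single pod named my-pod
  kubectl create deployment my-app --image=nginx    # creates a Deployment that manages its pods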

A deployment represents a group of replicas of the same pod and keeps your pods running even when the nodes they run on fail. A deployment could represent a component of an application or even an entire app. 

To see a list of the running pods in your project, run the command, kubectl get pods

Kubernetes creates a service with a fixed IP address for our pods, and a controller attaches an external load balancer with a public IP address to that service so that clients outside the cluster can access it.

In GKE, the load balancer is created as a network load balancer. Any client that reaches that IP address will be routed to a pod behind the service. 
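
As a hedged example, the hypothetical my-app Deployment from above could be exposed through such a load balancer like this:

  kubectl expose deployment my-app --port=80 --type=LoadBalancer
  kubectl get services    # once provisioned, the EXTERNAL-IP column shows the load balancer's public IP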

A service is an abstraction which defines a logical set of pods and a policy by which to access them. As deployments create and destroy pods, pods will be assigned their own IP addresses, but those addresses don't remain stable over time. 

A service groups a set of pods and provides a stable endpoint, or fixed IP address, for them. For example, if you create two sets of pods called frontend and backend, and put them behind their own services, the backend pods might change, but the frontend pods are not aware of this. They simply refer to the backend service.
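
A sketch of such a backend service (name, labels and ports are illustrative); the selector is what groups the pods, while the service itself keeps a stable cluster IP and DNS name even as matching pods come and go:

  apiVersion: v1
  kind: Service
  metadata:
    name: backend
  spec:
    selector:
      app: backend        # any pod labelled app=backend is part of this service
    ports:
      - port: 80          # stable port exposed by the service
        targetPort: 8080  # port the backend containers actually listen on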

To scale a deployment run the kubectl scale command. In this example, three pods are created in your deployment, and they're placed behind the service and share one fixed IP address. You could also use autoscaling with other kinds of parameters. For example, you can specify that the number of pods should increase when CPU utilization reaches a certain limit. 
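
Hedged examples of both approaches (deployment name and thresholds are placeholders):

  kubectl scale deployment my-app --replicas=3                             # fix the replica count at 3
  kubectl autoscale deployment my-app --min=3 --max=10 --cpu-percent=80    # add pods when average CPU usage exceeds 80%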

So far, we've seen how to run imperative commands like expose and scale. This works well to learn and test Kubernetes step by step. But the real strength of Kubernetes comes when you work in a declarative way. 

Instead of issuing commands, you provide a configuration file that tells Kubernetes what you want your desired state to look like, and Kubernetes determines how to do it. You accomplish this by using a deployment config file. You can check your deployment to make sure the proper number of replicas is running by using either kubectl get deployments or kubectl describe deployments. To run five replicas instead of three, all you do is update the deployment config file and run the kubectl apply command to use the updated config file.
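
Such a deployment config file might look roughly like this (all names are hypothetical):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-app
  spec:
    replicas: 3              # desired state: change to 5 and re-apply to scale up
    selector:
      matchLabels:
        app: my-app
    template:
      metadata:
        labels:
          app: my-app
      spec:
        containers:
          - name: my-app
            image: nginx     # placeholder image

After editing the replicas field, kubectl apply -f my-app-deployment.yaml brings the cluster to the new desired state.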

You can still reach your endpoint as before by using kubectl get services to get the external IP of the service and reach the public IP address from a client. 

The last question is: what happens when you want to roll out a new version of your app? You want to update your container to get new code in front of users, but rolling out all those changes at once would be risky. So in this case, you would use kubectl rollout, or change your deployment configuration file and then apply the change using kubectl apply. New pods will then be created according to your new update strategy. Here's an example configuration that will create new version pods individually and wait for a new pod to be available before destroying one of the old pods.
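
The configuration referred to above was not preserved in these notes; as a hedged sketch, the behaviour described (create one new-version pod, wait until it is available, then remove an old one) corresponds to a rolling update strategy like the following inside the hypothetical Deployment's spec, applied and observed with kubectl:

    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1         # create one new pod above the desired count...
        maxUnavailable: 0   # ...and never remove an old pod before its replacement is available

  kubectl apply -f my-app-deployment.yaml       # apply the updated config with the new image
  kubectl rollout status deployment/my-app      # watch the rolling update progress
  kubectl rollout undo deployment/my-app        # roll back if the new version misbehaves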



Container Orchestration


Applications run in their own containers. 

What if one application depends on another e.g. web server, running in one container, depends on the DB running in another container? 
What if the number of users increases and we need to scale out our application?
How to scale down when the load decreases?
How to build services across multiple machines without dealing with cumbersome network and storage settings? 
How to manage and roll out our microservices with different release cycles?

We should have an underlying platform that takes care of these dependencies and scaling. This process of deploying and managing containers is called container orchestration.

Container orchestration technologies:
  • Docker Swarm
    • easy to set up
    • lacks advanced features
  • Kubernetes (Google)
    • most popular
    • difficult to set up
    • has lots of options to support deployments of complex architecture setups
    • supported on all main public cloud service providers like GCP, Azure, AWS
  • Mesos (Apache)
    • difficult to set up
    • has advanced features
Kubernetes advantages:
  • Used to deploy and manage hundreds or thousands of containers in a clustered environment
  • Kubernetes is designed for high availability (HA). We have multiple instances of our application running on different nodes, so hardware failures on some nodes won't impact availability. We are also able to create multiple master nodes to prevent a single point of failure.
  • Traffic is load balanced across multiple containers.
  • Scaling is done by scaling the number of containers running on a single host but also increasing the number of hosts (hardware scaling) if processing demands reach maximum thresholds on existing nodes.
  • The lifetime of containers might be short. They may be killed or stopped at any time, for example when they exceed their resource limits, so how do we ensure our services are always backed by a certain number of containers? A ReplicationController or ReplicaSet in Kubernetes will ensure that a certain number of a group of containers (pods) is kept up.
  • Kubernetes even supports liveness probes to help you define your application health (see the pod sketch after this list).
  • For better resource management, we can also define the maximum capacity of Kubernetes nodes and the resource limits for each group of containers (a.k.a. a pod). The Kubernetes scheduler will then select a node that fulfills the resource criteria to run the containers.
  • Kubernetes provides an optional horizontal pod auto-scaling feature. With this feature, we could scale a pod horizontally by resource or custom metrics.
  • Perfect match for microservices, where it helps their CD (Continuous Delivery). We can create a Deployment to roll out, roll over, or roll back selected containers.
  • Containers are considered ephemeral - they can die quickly and/or often. We can mount a volume into a container to preserve the data in a single-host world. In the cluster world, a container might be scheduled to run on any host. Kubernetes Volumes and Persistent Volumes make volume mounting work seamlessly as permanent storage.
  • This is all achieved with the set of declarative object configuration files.
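
A hedged sketch combining the two ideas mentioned above: a pod with resource requests/limits and an HTTP liveness probe (image, path and numbers are placeholders):

  apiVersion: v1
  kind: Pod
  metadata:
    name: my-app
  spec:
    containers:
      - name: my-app
        image: nginx                # placeholder image
        resources:
          requests:
            cpu: 250m               # what the scheduler reserves when placing the pod
            memory: 128Mi
          limits:
            cpu: 500m               # hard ceiling for CPU usage
            memory: 256Mi           # exceeding the memory limit gets the container killed
        livenessProbe:
          httpGet:
            path: /healthz          # hypothetical health endpoint
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 5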

Kubernetes Architecture


Kubernetes system is divided into:
  • a set of primary components that run as the control plane
  • a set of nodes that run containers. In Kubernetes, a node represents a computing instance, like a machine. Note that this is different from a node on Google Cloud, which is a virtual machine running in Compute Engine


  • Node (worker node, minion)
    • machine, physical or virtual, on which Kubernetes is installed
    • worker machine on which containers will be launched by Kubernetes; workers run containers
    • if a node fails, our application will go down => we need to have more nodes
  • Cluster
    • Set of nodes grouped together
    • Even if one node fails, application is still accessible from other nodes
    • Having multiple nodes also helps with sharing the load
    • Kubernetes cluster consists of two types of nodes, master nodes and worker nodes. 
  • Master (master node)
    • responsible for managing the cluster
    • controls and schedules all activities in the cluster
    • stores the information about all members of the cluster
    • monitors nodes
    • when a node fails, moves the workload of the failed node to other worker nodes
    • Master is a node with Kubernetes installed on it and is configured as a master node
    • Master watches over the nodes in the cluster and is responsible for orchestration of containers on the worker nodes
    • Master nodes host the K8s control plane components. The master node will hold configuration and state data used to maintain the desired state.

When we install Kubernetes on the host, we install multiple components on it.

There are two types of nodes/servers: master and worker. And there is a set of components that make up Kubernetes. How are these components distributed across the different types of servers? How does one server become a master and another a worker?

master (controller) server (node):
  • API Server (kube-apiserver | Kubernetes)
    • this is what makes node a master
    • provides REST API and acts as the front end of Kubernetes cluster
    • users, management devices, command line interfaces talk to it in order to interact with Kubernetes cluster 
  • etcd service
    • All the information gathered is stored in a key-value store based on the popular etcd framework
    • the name etcd comes from the Unix /etc directory (where configuration lives) plus "d" for distributed
    • distributed reliable key-value store used by Kubernetes to store all data used to manage the cluster
    • when we have multiple nodes and multiple masters in the cluster, etcd stores all that information on all the nodes in the cluster in the distributed manner
    • responsible for implementing locks within the cluster to ensure there are no conflicts between the masters
  • Controller Manager (kube-controller-manager | Kubernetes)
    • brain behind the orchestration
    • responsible for noticing and responding when nodes, containers or endpoints go down
    • makes decisions to bring up new containers in such cases
  • Scheduler (kube-scheduler | Kubernetes)
    • responsible for distributing work of containers across multiple nodes
    • it looks for newly created containers (pods) and assigns them to nodes
Master components
(credit: DevOps with Kubernetes by Hideto Saito, Hui-Chuan Chloe Lee and Cheng-Yang Wu)

(I/F = Interface)


This article describes well the control plane of the master node:


API Server and its clients
(image credit: Rini Thomas; source: https://medium.com/@rinithomas/the-kubernetes-api-server-430a39aec2d7)

All communications and operations between the control plane components and external clients, such as kubectl (see Introduction to kubectl), are translated into RESTful API calls that are handled by the API server. 
Effectively, the API server is a RESTful web application that processes RESTful API calls over HTTP to store and update API objects in the etcd datastore.   
Control Plane on the master/controller node(s) consists of the API server, controller manager, and scheduler.  
API server is the central management entity and the only component that talks directly with the distributed storage component etcd. 
 API server has the following core responsibilities:
  • To serve the Kubernetes API. This API is used:
    • cluster-internally by the:
      • master components 
      • worker nodes
      • our Kubernetes-native apps
    • externally by clients such as kubectl
  • To proxy cluster components, such as the Kubernetes dashboard, or to stream logs, service ports, or serve kubectl exec sessions.  
Serving the API means:
  • Reading state: getting single objects, listing them, and streaming changes
  • Manipulating state: creating, updating, and deleting objects.  
A kubectl command is translated into an HTTP API request in JSON format and sent to the API server. The API server then returns a response to the client, along with any requested information.
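
One way to see this translation in action (both are standard kubectl features): verbose mode prints the underlying HTTP requests, and kubectl can also issue a raw request against an API path directly:

  kubectl get pods -v=8                                 # prints the GET request URL and the JSON response
  kubectl get --raw /api/v1/namespaces/default/pods     # the same listing, fetched as a plain REST call
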
API server is stateless (that is, its behavior will be consistent regardless of the state of the cluster) and is designed to scale horizontally. Usually, for the high availability of clusters, it is recommended to have at least three instances to handle the load and fault tolerance better.  

 

API Server internal processes 
(image credit: Rini Thomas; source: https://medium.com/@rinithomas/the-kubernetes-api-server-430a39aec2d7)

 

worker node (minion):
  • is where the containers are hosted e.g. Docker containers. 
  • kubelet service (agent) [kubelet | Kubernetes]
    • the agent that runs on each node in the cluster
    • interacts with the master to provide health information about the worker node and to carry out actions requested by the master on the worker nodes
    • makes sure that containers are running as expected
  • Container Runtime 
    • underlying software required for running containers on a system
    • Container Runtime can be Docker, rkt or CRI-O
    • in our case it's Docker but there are other options as well 

Node components 
(credit: DevOps with Kubernetes by Hideto Saito, Hui-Chuan Chloe Lee and Cheng-Yang Wu)


On both master and worker nodes runs kube-proxy | Kubernetes.

Understanding what components constitute the master and worker nodes will help us install and configure the right components on different systems when we set up our infrastructure. 


Quiz:


What is a worker machine in Kubernetes known as?
A Node in Kubernetes can only be a physical machine and can never be a virtual machine.
Multiple nodes together form what?
Which of the following processes runs on Kubernetes Master Node?
Which of the following is a distributed reliable key-value store used by kubernetes to store all data used to manage the cluster?
Which of the following services is responsible for distributing work or containers across multiple nodes?
Which of the following is the underlying framework that is responsible for running application in containers like Docker?
Which is the command line utility used to manage a kubernetes cluster?


Why do we need Kubernetes?


  • Let's for a moment keep Kubernetes out of our discussion and talk about simple Docker containers. Let's assume we were developing a process or a script to deploy our application on a Docker host. Then we would first simply deploy our application using a simple docker run command, and the application runs fine and our users are able to access it:
    • docker run python-app
  • When the load increases, we deploy more instances of our application by running the docker run commands many more times: 
    • docker run python-app --name app1
    • docker run python-app --name app2
    • docker run python-app --name app3
    • docker run python-app --name app4
  • Sometime in the future our application is further developed, undergoes architectural changes, and grows and gets complex. We now have a new helper container that helps our web application by processing or fetching data from elsewhere (NOTE: --link is a legacy option for docker run; it is recommended to use user-defined networks to facilitate communication between two containers instead of --link; see Legacy container links | Docker Docs):
    • docker run helper --link app1
    • docker run helper --link app2
    • docker run helper --link app3
    • docker run helper --link app4
  • These helper containers maintain a 1-to-1 relationship with our application containers and thus need to communicate with the application containers directly and access data from them. For this, we need to (manually):
    • maintain a map of what app and helper containers are connected to each other
    • establish network connectivity between these containers ourselves using links and custom networks
    • create shareable volumes and share them among the containers. We would need to maintain a map of that as well.
    • monitor the state of the application container
      • When it dies, manually kill the helper container as well, as it's no longer required.
      • When a new application container is deployed, we would need to deploy a new helper container as well.
  • Kubernetes does all of this for us automatically. We just need to define what containers a pod consists of; the containers in a pod by default have access to the same storage and the same network namespace, and share the same fate: they are created together and destroyed together (see the example pod definition after this list).
  • Even if our application doesn't happen to be so complex and we could live with a single container, Kubernetes still requires us to create pods, but this is good in the long run as our application is then equipped for architectural changes and scaling in the future.
  • However, multi-container pods are a rare use case; a single container per pod is the most common one.
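
A hedged sketch of the multi-container pattern described above: an application container and a helper container packaged in one pod, sharing a volume and the pod's network namespace (all names and images are hypothetical):

  apiVersion: v1
  kind: Pod
  metadata:
    name: app-with-helper
  spec:
    volumes:
      - name: shared-data
        emptyDir: {}              # scratch volume that lives and dies with the pod
    containers:
      - name: app
        image: python-app         # hypothetical application image
        volumeMounts:
          - name: shared-data
            mountPath: /data
      - name: helper
        image: python-app-helper  # hypothetical helper image
        volumeMounts:
          - name: shared-data
            mountPath: /data      # both containers read and write the same files
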
To learn more about pods, please read the next article in this series: Kubernetes Pods | My Public Notepad


Kubernetes Cloud-native Application Architecture


A typical cloud-native application architecture consists of 3 tiers:
  • frontend e.g. Nginx
  • backend e.g. Wordpress
  • persistence (database) e.g. MariaDB

We define and create multiple resources for each of these tiers in Kubernetes:

source: Get Started with Bitnami Charts using Minikube


---
