This piece is the second part of a two-part series.
In Part One of this series, we discussed:
- What is Kubernetes?
- What problems does it aim to solve?
- When should one choose to use Kubernetes? What alternatives are available?
In this piece, we will explore:
- What are the design principles and architecture of Kubernetes?
- How to use Kubernetes, and a simple example.
To understand an example that describes how to deploy applications on Kubernetes, one should first have a preliminary understanding of Kubernetes architecture and objects. Thus, we will first outline the design principles and architecture of Kubernetes, followed by a brief explanation of relevant Kubernetes objects, and, finally, the example itself.
Design Principles and Architecture Behind Kubernetes
Kubernetes is architected to abide by a set of design principles. To better understand why Kubernetes is architected the way it is, one should be familiar with these principles. So, let’s start our discussion there.
Design principles of Kubernetes
- Portable: Kubernetes can run anywhere. Kubernetes runs with consistent behavior across various environments: public cloud, private cloud, on-premises, or a personal laptop. Applications deployed on Kubernetes can be ported across different environments with minimal effort.
- General-purpose: Kubernetes doesn’t put any restrictions on what type of applications can be deployed through it. Although it focuses on deployment and management of micro-services and cloud-native applications, any type of workload (batch jobs, stateless or stateful services, legacy monolithic single instance applications) can be deployed through Kubernetes. Applications could be written in any language or framework without any restrictions.
- Flexible: Kubernetes allows for many parts of its functionality to be substituted with custom-built solutions. This gives the ability to use a specialized solution along with Kubernetes wherever necessary. To ensure this flexibility, Kubernetes is built as a collection of pluggable components and layers.
- Extensible: Kubernetes facilitates the addition of specialized capabilities whenever necessary. This is achieved by exposing interfaces, which could be implemented to add new functionality on top of existing functionality. This allows for numerous add-ons to be developed for Kubernetes.
- Automatable: Kubernetes aims to reduce the burden of manual operations. Once configured, applications deployed through Kubernetes will scale and self-heal without any manual intervention. Kubernetes can be integrated with a Continuous Integration (CI) pipeline, allowing a code change committed by a developer to be deployed onto the test environment automatically.
Each of these principles adds great value to the end user who is using Kubernetes. Portability allows for reliable testing of the application on various environments, such as testing and production, and prevents getting locked in with a single cloud-provider or vendor.
Being general-purpose gives developers the freedom to choose the exact development tools and frameworks necessary to deliver the business functionality, without worrying about the infrastructure or deployment.
Flexibility and extensibility allow the addition of customized functionality wherever the built-in functionality is not sufficient.
Automatability ensures that the manual work necessary for the maintenance of a large-scale application is kept to a minimum. This allows a relatively small team to successfully maintain a large-scale, distributed application deployed on the cloud.
Let’s now discuss the Kubernetes architecture, which was developed with these principles in mind.
Architecture of Kubernetes
At a high level, Kubernetes comprises a master system and workers. The master system controls the workers and runs applications on them. The desired state of the cluster (compute resources) is represented as abstract objects. These abstract Kubernetes objects are records of intent. Kubernetes constantly works to ensure that the state represented in these abstract objects is the actual physical state of the cluster. An external client can connect to the master to obtain information about the cluster state and to issue commands that change it as required.
Whenever one wishes to update the physical state of the cluster, all they would have to do is update the abstract Kubernetes objects, and Kubernetes will take care of the rest. Let’s dive deeper and briefly discuss the components of the master system and workers.
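To make the idea of "records of intent" concrete, here is a minimal Python sketch of comparing a desired state with an observed state and deriving the corrective actions. It is only an illustration of the principle, not how Kubernetes is implemented; the function name reconcile and the replica-count dictionaries are assumptions made for this example.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions needed to move `actual` replica counts to `desired`."""
    actions = []
    for app in sorted(set(desired) | set(actual)):
        want = desired.get(app, 0)
        have = actual.get(app, 0)
        if have < want:
            actions.append(f"start {want - have} x {app}")
        elif have > want:
            actions.append(f"stop {have - want} x {app}")
    return actions


# Desired state as declared by the user; actual state as observed on the workers.
desired_state = {"frontend": 2, "backend": 2, "mysql": 1}
actual_state = {"frontend": 2, "backend": 1}

for action in reconcile(desired_state, actual_state):
    print(action)
```

The user only ever edits desired_state; working out which instances to start or stop is left to the system.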
Components of Kubernetes master
The Kubernetes master system, also known as the control plane, is designed as a set of components. Let’s briefly discuss its key components.
- API server: Kubernetes mostly uses a REST API for internal and external communication. All the abstract Kubernetes objects are exposed as REST resources. The API server is the component responsible for processing REST requests, validating them, and performing the appropriate CRUD operations on the corresponding abstract Kubernetes objects.
- Cluster State Store: To perform the CRUD operations, the API server needs a backing data store. As the name indicates, the cluster state store is a persistent storage instance that stores the state of all the abstract Kubernetes objects configured in the system. The cluster state store supports watch functionality, through which all the coordinating components can be quickly notified whenever a change is made to an object.
- Controller Manager: This is the component of the master that runs controllers. Controllers run loops that compare the actual cluster state with the state represented in the abstract Kubernetes objects. Whenever they are notified of a change, they perform the actions necessary to make the actual state consistent with the abstract state. Kubernetes has numerous controllers, each one responsible for a different set of Kubernetes objects.
- Scheduler: This is the component of the master responsible for allocating physical resources on the cluster to run the applications and jobs added to the cluster state store. Scheduling decisions take into account numerous factors, such as hardware and software constraints.
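The scheduler's job can be pictured as "filter the nodes that can fit the workload, then score the remaining ones". The Python sketch below is a toy version of that idea under simplifying assumptions: the node and pod dictionaries, their field names, and the scoring rule are all invented for illustration, and the real scheduler considers many more factors.

```python
from typing import Optional


def schedule(pod: dict, nodes: list) -> Optional[str]:
    """Pick a node for `pod`: drop nodes that cannot fit it, then score the rest."""
    feasible = [
        node for node in nodes
        if node["free_cpu"] >= pod["cpu"] and node["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None  # nothing fits; the pod would have to wait
    # Toy scoring rule: prefer the node with the most CPU left after placement.
    best = max(feasible, key=lambda node: node["free_cpu"] - pod["cpu"])
    return best["name"]


nodes = [
    {"name": "node-a", "free_cpu": 1.0, "free_mem": 2048},
    {"name": "node-b", "free_cpu": 3.0, "free_mem": 1024},
    {"name": "node-c", "free_cpu": 2.0, "free_mem": 4096},
]
print(schedule({"cpu": 0.5, "mem": 1500}, nodes))
```

Here node-b is filtered out because it lacks memory, and node-c wins over node-a because it has more CPU to spare.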
A Kubernetes master system could have multiple replicas of each of these components to ensure high availability, and could be deployed along with worker node components on a single physical instance. However, for simplicity, setup scripts typically start all master components on the same machine, and do not run any worker instances on this machine.
The exact cluster setup depends on the requirements of the end user. For smaller applications, a single instance with both master and worker components is more than sufficient. For larger applications, customized effort is essential for configuring the Kubernetes cluster.
Components of Kubernetes worker
The worker instances, or nodes, are also composed of multiple components. The main function of Kubernetes worker components is to process the instructions from master and execute them on the node. The following are the key components of a worker node:
- Kubelet: This is the component of the worker responsible for making sure that the containers scheduled by the master on this node are running and healthy.
- Container runtime: The container runtime is the software responsible for running containers. Kubernetes supports several runtimes, including any implementation of the Kubernetes Container Runtime Interface (CRI). Each worker node uses its runtime to run the containerized applications scheduled by the master. Running non-containerized applications is discouraged and not supported by Kubernetes.
- Kube proxy: This is the component of the worker responsible for maintaining network rules on the worker and performing connection forwarding. This enables efficient and effective communication throughout the cluster. External application traffic is redirected to the appropriate container through this component.
External Kubernetes client
Theoretically, an external Kubernetes client could be any application that can communicate with the API server through the well-defined REST API. But the predominant choice is kubectl.
kubectl is a command-line tool intended for the end user responsible for managing application deployments. Users execute kubectl commands in a terminal. Each command is converted into an API call in the background and sent to the API server on the Kubernetes master, where the necessary action is performed.
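To give a feel for what such an API call looks like, the following Python sketch builds the REST path under which the API server exposes a namespaced resource. The helper name resource_path and the deployment name backend are assumptions for illustration; the sketch only constructs paths and does not contact a cluster.

```python
from typing import Optional


def resource_path(group: str, version: str, plural: str,
                  namespace: str = "default", name: Optional[str] = None) -> str:
    """Build the API server path for a namespaced resource."""
    # The core group (pods, services, ...) lives under /api; named groups under /apis.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    path = f"{prefix}/namespaces/{namespace}/{plural}"
    return f"{path}/{name}" if name else path


# Roughly the request behind listing pods in the default namespace.
print("GET", resource_path("", "v1", "pods"))
# Roughly the request behind reading a single deployment (hypothetical name).
print("GET", resource_path("apps", "v1", "deployments", name="backend"))
```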
Let’s take a step back and look at the overall architecture of Kubernetes. One could notice that it is designed as a set of loosely coupled components working together, instead of a single monolithic instance being responsible for all the functionality. We already discussed the various advantages of such an architectural style. In particular, this choice allows Kubernetes to stay flexible and extensible.
The choice to use REST to create and update the cluster configuration ensures that any configuration created on one environment will work on any other environment. This allows application deployments created on Kubernetes to remain portable.
The controller manager and scheduler continuously watch for changes to the abstract objects in the cluster state store. They send instructions to worker nodes whenever necessary to automatically update the actual state of the cluster. This design choice eliminates a great deal of manual work and ensures Kubernetes is automatable. In fact, this declarative approach to cluster management was one of the main features that led to the rapid adoption of Kubernetes.
The choice to run only containerized applications by interacting with the container runtime through an interface ensures that any type of application could run on Kubernetes, and allows Kubernetes to remain general purpose.
We can now look into the abstract objects used to represent and manage the cluster state in Kubernetes. Knowledge of these Kubernetes objects is the final piece of the puzzle we need to understand before we dive into the example.
Kubernetes defines a large number of abstract objects. For brevity’s sake, we will only discuss those Kubernetes objects that are absolutely essential for understanding our example.
- Pod: We know that, through Kubernetes, we can run containerized applications. Instead of abstracting a single container as a Kubernetes object, Kubernetes defines the pod, which is a group of one or more containers. There is an advantage that comes with this choice. For simpler cases, each pod in the system could represent a single container. But whenever there is a need to deploy additional capabilities that are not directly related to the core business functionality of the container (like support for logging, caching, etc.), we have the option to package these additional capabilities into separate containers and place them in a single pod. This ensures they always stay logically together. Pods are the smallest deployable units of computing that can be created and managed in Kubernetes. A pod is where the actual application code implemented by the end user runs. Each pod has its own IP address and is completely decoupled from the host.
- Service: In Kubernetes, pods are volatile. To ensure high availability and optimum use of compute resources, Kubernetes can dynamically kill and create pods. Because of this, the IP address of a pod is not a reliable way to access the business functionality offered by the pod. Instead, Kubernetes recommends using a service. A Kubernetes service is an abstraction that defines a logical set of pods and a policy to access them. Every Kubernetes service has an IP address but, unlike the IP address of a pod, it is stable. A Kubernetes service continuously keeps track of all the pods in the system and identifies the pods it is expected to target. Whenever a request to access a particular business functionality reaches the service, it redirects the request to the IP address of one of the pods that are active in the system at that point in time. Ideally, to access the pods from outside the cluster, one would use Ingress. When this piece was written, however, the Kubernetes Ingress feature was still in beta. Thus, in this example, we will use a service to expose the traffic externally as well.
- Persistent Volume and Persistent Volume Claim: Managing storage is a distinct problem from managing compute. Kubernetes defines two key abstractions to handle this problem: the persistent volume and the persistent volume claim. In Kubernetes, a persistent volume is a piece of storage that has been provisioned for the cluster to use for its storage requirements. A persistent volume claim is a request by an application to consume the abstract storage resources declared through a persistent volume. To make persistent storage available to the applications running inside Kubernetes, one should first declare a persistent volume and then configure the application to make a claim to use that volume.
- ConfigMap: A ConfigMap is a Kubernetes abstraction meant to decouple environment-dependent application configuration data from containerized applications, allowing them to remain portable across environments.
- Secrets: A secret is an object that contains a small amount of sensitive data such as a password, a token, or a key. Putting such sensitive information in a secret allows for more control over how it is used and reduces the risk of accidental exposure.
- Deployment: A deployment is an abstraction meant to represent the desired state of an actual deployment on Kubernetes. A deployment object typically contains all the information required: the location from which to obtain the container images, the configuration of the pods expected to package and run these containers, the number of replicas of each pod that should be maintained, the location of application configuration in terms of ConfigMaps and secrets meant to be used by the containers, and the configuration of data storage (if the application needs persistent data storage). All of these can be declared inside a deployment. Although it is possible to create individual pods and services in Kubernetes, it is recommended that one use a deployment to manage deployments. By using the deployment object, typical operations like roll-out, roll-back, and monitoring are greatly simplified.
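One detail worth illustrating from the list above is how a service identifies the pods it targets. In Kubernetes this is done with label selectors: a service carries a set of key/value pairs, and every pod whose labels contain those pairs is a candidate. The Python sketch below mimics that matching. The pod records, the IP addresses, and the ready flag are simplified assumptions for illustration.

```python
def select_pods(selector: dict, pods: list) -> list:
    """Return the IPs of ready pods whose labels contain every selector pair."""
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


pods = [
    {"ip": "10.0.0.4", "labels": {"app": "backend"}, "ready": True},
    {"ip": "10.0.0.7", "labels": {"app": "backend"}, "ready": False},  # being replaced
    {"ip": "10.0.0.9", "labels": {"app": "frontend"}, "ready": True},
]
print(select_pods({"app": "backend"}, pods))
```

Because matching is by label rather than by IP address, a replacement pod with the same labels is picked up automatically, which is what keeps the service address stable for its clients.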
How To Use Kubernetes, and a Simple Example
Now that we have gone through the basics of Kubernetes, we will take a detailed look at a simple example. In this example, we will deploy a web application using Kubernetes. We will be using Docker as our container runtime. The application in our example has three distinct parts:
- Database (MySQL server)
- Back end (Java Spring Boot application)
- Front end (Angular app)
We will deploy all of these components on a Kubernetes cluster. We will have one replica of the database, two replicas of the back end, and two replicas of the front end. Front-end instances will communicate with the back end through HTTP. Back-end instances will communicate with the database. To facilitate this communication, we have to configure Kubernetes accordingly.
We will configure the cluster by creating Kubernetes objects. These Kubernetes objects will contain the desired state of our deployment. Once these objects are persisted into the cluster state store, the internal architecture of Kubernetes will take necessary steps to ensure that the abstract state in the cluster state store is the same as the physical state of the cluster.
We will use kubectl to create the objects. Kubernetes supports both imperative and declarative ways of creating objects. Production environments are generally configured with the declarative approach, which is what we will use in this example. For each object, we will first prepare a manifest file, a YAML file containing all the information related to the object. Then we will execute the command kubectl apply -f <FILE_NAME> to persist the object in the cluster state store.
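The declarative approach means that applying the same manifest repeatedly is safe: the object is created the first time, left alone if nothing changed, and updated if the file was edited. The Python sketch below imitates that behavior with an in-memory dictionary standing in for the cluster state store. The function name apply and the object name example-config are assumptions for illustration, and the real kubectl logic is considerably more involved.

```python
def apply(store: dict, manifest: dict) -> str:
    """Create or update `manifest` in `store`, keyed by (kind, name)."""
    key = (manifest["kind"], manifest["metadata"]["name"])
    if key not in store:
        store[key] = manifest
        return "created"
    if store[key] == manifest:
        return "unchanged"
    store[key] = manifest
    return "configured"


cluster_state = {}  # stand-in for the cluster state store
config = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "example-config"},  # hypothetical object name
    "data": {"greeting": "hello"},
}

print(apply(cluster_state, config))  # first apply creates the object
print(apply(cluster_state, config))  # re-applying the same manifest changes nothing
config = {**config, "data": {"greeting": "hi"}}
print(apply(cluster_state, config))  # an edited manifest updates the stored object
```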
We will first containerize the application code we have implemented. After this, we will configure the deployment of our database followed by back end. We will finish the example by configuring the front end.
Step 1. Containerize the application and upload image to container image registry
The first step is to create a container image of the application we have implemented and upload it to a container registry.
A container image is a packaged form of the containerized application. It can be transferred across computers, just like any normal file. The container runtime environment can create a running instance of a containerized application using the container image.
A container registry is generally the centralized repository where container images are stored. One can upload container images to a container registry and download them wherever and whenever they are needed. There are numerous container registry services available: Azure Container Registry, Google Container Registry, Amazon ECR, etc. We will use Docker Hub for this example, but one could use any image registry (public or private) that fits their use case.
The application we are going to deploy has a front end implemented with the Angular framework and a back end implemented with the Spring Boot framework. Links to the GitHub repositories containing the code are provided in the final section of this piece. Once we have implemented the code as per our requirements, we build executables with build tools (Angular CLI and Maven in this case) and package them into container images.
Once the container images are created, we can upload them to any container image registry. Here we will upload these images to Docker Hub. We have uploaded the front-end image with the name kubernetesdemo/to-do-app-frontend, and the back-end image with the name kubernetesdemo/to-do-app-backend. We will obtain the database image from the official MySQL Docker repository, mysql. Official Docker images, like mysql, generally do not have any prefix. Unofficial images are required to have a prefix, like kubernetesdemo/ here.
We have to mention the name of these images in the Kubernetes manifest files, which we will see below. Kubernetes will fetch and run these images on respective cluster nodes whenever required.
Step 2. Set up Kubernetes cluster and CLI
There are numerous solutions available for setting up a Kubernetes cluster. Different Kubernetes solutions meet different requirements: ease of maintenance, security, control, available resources, and expertise required to operate and manage a cluster. One could refer to the official documentation for more details about how a cluster could be set up. This example has been replicated on both local (Minikube) and cloud provider (GKE) setup. Kops is a project that aims to simplify the Kubernetes cluster setup process.
As mentioned earlier, we will use kubectl as our CLI. Instructions for installing kubectl can be found here. Once kubectl is installed, it should be configured to communicate with the Kubernetes cluster we have set up. In the case of Minikube, the minikube start command will automatically configure kubectl. For a cloud setup, instructions can be found in the respective quick start guide (e.g., GKE).
Step 3. Database configuration setup
Back-end instances need to communicate with the database. All the configuration details required to connect with the database are stored in a configuration file.
Let’s take a look at the back-end Spring configuration file in this example.
This configuration file expects some environment variables, like DB_USERNAME, DB_PASSWORD, DB_HOST, and DB_NAME. We will pass the values of these variables to Kubernetes through ConfigMaps and secrets. Then we will configure the back-end pod to read the environment variables from the ConfigMaps and secrets.
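To illustrate how a configuration value ends up depending on these environment variables, the Python sketch below resolves ${NAME} placeholders, the style Spring uses in its property files, against a dictionary of variables. The JDBC URL template and all the values are assumptions for illustration and are not taken from the actual configuration file.

```python
import re


def resolve(template: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from `env`; fail on missing names."""
    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"missing environment variable: {name}")
        return env[name]
    return re.sub(r"\$\{(\w+)\}", lookup, template)


# Hypothetical values; in the cluster they come from ConfigMaps and secrets.
env = {
    "DB_HOST": "mysql",
    "DB_NAME": "todo",
    "DB_USERNAME": "app",
    "DB_PASSWORD": "change-me",
}
print(resolve("jdbc:mysql://${DB_HOST}/${DB_NAME}", env))
```

The same container image can therefore point at a different database in each environment simply by being started with different variable values.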
The MySQL database Docker image also expects some environment variables. We will need to configure the following: MYSQL_ROOT_PASSWORD, MYSQL_USER, MYSQL_PASSWORD, and MYSQL_DATABASE.
Now that we have an idea about the configuration required for our application, we will create configMaps and secrets in our Kubernetes cluster with required data.
First, to hold database-specific information, we will create one ConfigMap and two secrets. The ConfigMap will contain non-sensitive information about the database setup, like the location where the database is hosted and the name of the database. We will define a Kubernetes service that exposes the location of the database. Kubernetes DNS will resolve the service name to the actual IP address of the database at runtime. Below is the ConfigMap used to store non-sensitive information related to the database in this example.
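A minimal sketch of what such a ConfigMap could contain, written as a Python dictionary and printed as JSON (kubectl accepts JSON manifests as well as YAML). The object name db-conf and the database name todo are assumptions for illustration, as is the default namespace. The host value is simply the name of the database service, and the helper shows the fully qualified DNS name under which that service is reachable inside a cluster using the default cluster.local domain.

```python
import json


def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Fully qualified in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


db_config = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "db-conf"},  # hypothetical object name
    "data": {
        "host": "mysql",  # name of the database Service, resolved by cluster DNS
        "name": "todo",   # hypothetical database name
    },
}

print(json.dumps(db_config, indent=2))
print(service_dns_name(db_config["data"]["host"]))
```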
We will use two secrets to store sensitive data. The first secret will contain the database root user credentials, and the second secret will contain the application user credentials. Below are these two files.
By executing kubectl apply -f <FILE_NAME>, we create the ConfigMap and Secret objects in our Kubernetes cluster state store. We stored the values of host and name in the ConfigMap object, and username and password in the Secret objects. We will access these secrets and ConfigMaps in later steps to configure our deployments.
Similarly, the configuration of the front end expects the environment variable SERVER_URI, which indicates where the back end is hosted. We will create this ConfigMap after configuring the back-end Deployment.
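A minimal sketch of two such secrets, again as Python dictionaries printed as JSON. Values in the data field of a Secret must be base64-encoded, which the helper below takes care of. The secret names db-root-credentials and db-credentials and all credential values are placeholders assumed for illustration. Note that base64 is an encoding, not encryption, so manifests like these should not be committed with real passwords.

```python
import base64
import json


def make_secret(name: str, values: dict) -> dict:
    """Build an Opaque Secret manifest, base64-encoding each value for `data`."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "Opaque",
        "metadata": {"name": name},
        "data": encoded,
    }


# Hypothetical names and placeholder credentials.
root_secret = make_secret("db-root-credentials", {"password": "change-me-root"})
user_secret = make_secret("db-credentials", {"username": "app", "password": "change-me"})

print(json.dumps(root_secret, indent=2))
print(json.dumps(user_secret, indent=2))
```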
Step 4. Configure PVC, service, and deployment for database
Our next step would be to create the services and deployments required for our database setup. Below is the file which creates relevant Kubernetes Service and Kubernetes Deployment for the database setup in this application.
Through this file, we created multiple Kubernetes objects. First, we created a Kubernetes Service with the name mysql for accessing the pod running the MySQL container. Next, we created a Persistent Volume Claim (PVC) of one GB, which results in the Kubernetes cluster dynamically allocating the required persistent storage for MySQL (enable default dynamic storage if it is not enabled in your cluster). After this, we created a Deployment object, which configures the deployment of MySQL Server in the cluster. Into the MySQL container, we injected environment variables like MYSQL_ROOT_PASSWORD, MYSQL_USER, MYSQL_PASSWORD, and MYSQL_DATABASE using the ConfigMaps and secrets we created in the previous step.
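The three objects described above could look roughly like the following sketch, expressed as Python dictionaries and printed as JSON. The service name mysql, the image mysql, and the environment variable names come from the text. The claim name, the label, the ConfigMap and secret names (db-conf, db-root-credentials, db-credentials), and their keys are assumptions for illustration.

```python
import json


def from_secret(var: str, secret: str, key: str) -> dict:
    """Environment variable whose value is read from a Secret key."""
    return {"name": var, "valueFrom": {"secretKeyRef": {"name": secret, "key": key}}}


def from_config(var: str, config_map: str, key: str) -> dict:
    """Environment variable whose value is read from a ConfigMap key."""
    return {"name": var, "valueFrom": {"configMapKeyRef": {"name": config_map, "key": key}}}


def mysql_manifests() -> tuple:
    """Return (service, claim, deployment) for a single-replica MySQL setup."""
    labels = {"app": "mysql"}
    service = {
        "apiVersion": "v1", "kind": "Service",
        "metadata": {"name": "mysql"},
        "spec": {"selector": labels, "ports": [{"port": 3306}]},
    }
    claim = {
        "apiVersion": "v1", "kind": "PersistentVolumeClaim",
        "metadata": {"name": "mysql-pv-claim"},  # hypothetical claim name
        "spec": {"accessModes": ["ReadWriteOnce"],
                 "resources": {"requests": {"storage": "1Gi"}}},  # about one GB
    }
    container = {
        "name": "mysql",
        "image": "mysql",
        "ports": [{"containerPort": 3306}],
        "env": [
            from_secret("MYSQL_ROOT_PASSWORD", "db-root-credentials", "password"),
            from_secret("MYSQL_USER", "db-credentials", "username"),
            from_secret("MYSQL_PASSWORD", "db-credentials", "password"),
            from_config("MYSQL_DATABASE", "db-conf", "name"),
        ],
        # MySQL keeps its data files under /var/lib/mysql.
        "volumeMounts": [{"name": "data", "mountPath": "/var/lib/mysql"}],
    }
    deployment = {
        "apiVersion": "apps/v1", "kind": "Deployment",
        "metadata": {"name": "mysql"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [container],
                    "volumes": [{"name": "data",
                                 "persistentVolumeClaim": {"claimName": "mysql-pv-claim"}}],
                },
            },
        },
    }
    return service, claim, deployment


for manifest in mysql_manifests():
    print(json.dumps(manifest, indent=2))
```

The service selector, the deployment selector, and the pod template labels must agree, and the claimName in the pod must match the name of the claim. These are the links that tie the three objects together.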
Step 5. Configure service and deployment for back end
Next, we set up our back-end application deployment. Below is the YAML file which creates the required Kubernetes objects.
Here we first created a Service of type LoadBalancer (use NodePort if you are running Kubernetes locally), which exposes the back-end instances. The LoadBalancer type provides an External-IP through which one can access the back-end services externally (use minikube ip with the port if you are using Minikube). Next, we created the Deployment object, configured to contain two replicas of the back-end instance, and injected the required environment variables from the ConfigMaps and secrets we created earlier. This deployment uses the image kubernetesdemo/to-do-app-backend, which we created in step one.
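A sketch of the pair of objects described above, built by a small Python helper and printed as JSON. The image name and the replica count come from the text. The object name to-do-app-backend, port 8080 (the Spring Boot default), and the ConfigMap and secret names and keys are assumptions for illustration.

```python
import json


def web_workload(name, image, port, replicas, env=None, service_type="LoadBalancer"):
    """Return (service, deployment) manifests for a replicated HTTP workload."""
    labels = {"app": name}
    service = {
        "apiVersion": "v1", "kind": "Service",
        "metadata": {"name": name},
        "spec": {"type": service_type, "selector": labels,
                 "ports": [{"port": port, "targetPort": port}]},
    }
    deployment = {
        "apiVersion": "apps/v1", "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                    "env": env or [],
                }]},
            },
        },
    }
    return service, deployment


def config_ref(var, config_map, key):
    return {"name": var, "valueFrom": {"configMapKeyRef": {"name": config_map, "key": key}}}


def secret_ref(var, secret, key):
    return {"name": var, "valueFrom": {"secretKeyRef": {"name": secret, "key": key}}}


backend_env = [
    config_ref("DB_HOST", "db-conf", "host"),
    config_ref("DB_NAME", "db-conf", "name"),
    secret_ref("DB_USERNAME", "db-credentials", "username"),
    secret_ref("DB_PASSWORD", "db-credentials", "password"),
]
service, deployment = web_workload(
    "to-do-app-backend", "kubernetesdemo/to-do-app-backend", 8080, 2, backend_env)

print(json.dumps(service, indent=2))
print(json.dumps(deployment, indent=2))
```

Passing service_type="NodePort" instead would produce the variant suggested for local clusters.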
Step 6. Front-end configuration setup
Front end expects the value of External-IP of back end, generated in the above step, to be passed in the form of the environment variable SERVER_URI. We will now create a config map to store this information related to the back-end setup.
We will use this configMap to inject SERVER_URI value when configuring the deployment of front end, in the next step.
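As a sketch of this step, the Python snippet below builds a ConfigMap holding the back-end address and the container environment entry that reads SERVER_URI from it. The ConfigMap name backend-conf, the key server-uri, port 8080, and the IP address (taken from a range reserved for documentation) are assumptions for illustration. In practice, the address is the External-IP reported for the back-end service.

```python
import json


def backend_config(external_ip: str, port: int = 8080) -> dict:
    """ConfigMap holding the externally reachable address of the back end."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "backend-conf"},  # hypothetical object name
        "data": {"server-uri": f"http://{external_ip}:{port}"},
    }


def server_uri_env(config_map: dict) -> dict:
    """Container env entry that reads SERVER_URI from the given ConfigMap."""
    return {
        "name": "SERVER_URI",
        "valueFrom": {"configMapKeyRef": {
            "name": config_map["metadata"]["name"],
            "key": "server-uri",
        }},
    }


config = backend_config("203.0.113.10")  # placeholder External-IP
print(json.dumps(config, indent=2))
print(json.dumps(server_uri_env(config), indent=2))
```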
Step 7. Configure service and deployment for front end
Next, we set up our front-end application deployment. Below is the YAML file which creates the required Kubernetes objects.
Here, we first created a Service of type LoadBalancer (use NodePort if you are running Kubernetes locally), which exposes the front-end instances. The LoadBalancer type provides an External-IP through which one can access the front-end services externally (use minikube ip with the port if you are using Minikube). Next, we created the Deployment object, configured to contain two replicas of the front-end instance. This deployment uses the image kubernetesdemo/to-do-app-frontend, which we created in step one. After this, we injected the environment variable SERVER_URI from the ConfigMap we created in the previous step.
That’s it. Our simple application is now completely deployed. After this, the front end of the application should be accessible from any browser using the External-IP of the front-end service. The Angular app will call the back end through HTTP, and the back end will communicate with the MySQL database, where our application data is persisted. The image below shows the overall setup described in this example.
This entire deployment is now managed by Kubernetes. If one of the pods goes down for unknown reasons, Kubernetes will bring up a new pod without any manual intervention. Using kubectl, we could monitor and update this deployment, whenever required.
- The GitHub repository containing the manifest files used for deployment and configuration of the Kubernetes cluster discussed in this piece can be found here.
- The GitHub repository containing the back-end implementation discussed in this piece can be found here.
- The GitHub repository containing the front-end implementation discussed in this piece can be found here.