Introduction to Kubernetes

Hey guys, I hope you've read my previous blogs. If so, thank you, and I hope you'll enjoy this one too.

This blog is on Kubernetes, as you've seen in the title. We'll cover an introduction to Kubernetes, its components, its architecture and some kubectl commands.

Let's get started

What is Kubernetes?

Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.

Let's understand some components of Kubernetes

Components of Kubernetes

1. Pod:

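The smallest deployable unit of k8s.
It's an abstraction (layer) over a container.
Usually a pod runs 1 main application container, but it can hold multiple containers too.
Each pod gets its own IP address (an internal IP address that pods use to talk to each other, for example the application pod talking to the DB pod).
Pods are ephemeral and can die easily. When a DB pod dies, a new pod is created in its place and it gets a new IP address, so anything that was connected to the old IP address loses its connection.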

Because of the above issue, another component of k8s called Service is used.

2. Service

A Service is a static/permanent IP address that sits in front of a pod (or a set of pod replicas).
The application pod will have its own service and the DB pod will have its own service.
The lifecycles of the pod and the service are not connected. So even if the pod dies, the service and its IP address stay.
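
To make the idea of a stable address more concrete, here is a minimal Python sketch. It is only a toy model and not how k8s is implemented: the class names, the label "app": "mongodb" and all the IP addresses are made up for illustration. The service keeps its own IP and finds its pods by label, so a replaced pod with a new IP is picked up automatically.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str        # changes every time the pod is recreated
    labels: dict


@dataclass
class Service:
    name: str
    cluster_ip: str   # stays the same for the lifetime of the service
    selector: dict    # labels a pod must carry to receive traffic

    def endpoints(self, pods):
        """Return the IPs of all pods whose labels match the selector."""
        return [
            pod.ip for pod in pods
            if all(pod.labels.get(k) == v for k, v in self.selector.items())
        ]


# Hypothetical names and addresses, chosen only for this example.
db_service = Service("mongodb-service", "10.96.0.10", {"app": "mongodb"})

pods = [Pod("mongodb-1", "10.244.0.5", {"app": "mongodb"})]
before = db_service.endpoints(pods)

# The pod dies and its replacement comes up with a different IP.
pods = [Pod("mongodb-2", "10.244.0.9", {"app": "mongodb"})]
after = db_service.endpoints(pods)

print("service IP:", db_service.cluster_ip)
print("pod IPs before:", before, "after:", after)
```

Clients only ever talk to the service IP, so they never need to know that the pod behind it was replaced.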

Since we want our application to be accessible through a browser, we need an external service, which is a service that opens communication from external sources.

But we don't want our DB to be open to the public, and for that you would create an internal service. Internal or external is a type that you specify when creating the service.

Sample URL of an external service: http://124.89.101.2:8080

For the end product we would rather use a URL like https://myapp.com (HTTPS for security, plus a domain name). The IP address and port form is fine for testing.

And for that purpose, we have another component of k8s

3. Ingress:

Instead of going straight to the service, the request first goes to Ingress, which forwards it to the service.
It is used to route external traffic (HTTP/HTTPS) into the cluster.

4. ConfigMap:

It's the external configuration of your application.
It contains configuration data of your application, like the URL of the database, the name of the database and the URLs of other services it talks to.
The use of a ConfigMap is that when you change the name of the database, you just need to update the name in the ConfigMap and the pod picks it up, because the pod is connected to the ConfigMap. Keep in mind that a pod which reads the value as an environment variable only sees the new value after it restarts.
So instead of rebuilding the image, pushing it to the repo and pulling it into your pod, you just have to update the ConfigMap.
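
From the application's side, this can look like the small Python sketch below. It assumes the ConfigMap values are handed to the container as environment variables, which is one common way to consume a ConfigMap. The variable names DATABASE_URL and DATABASE_NAME and the fallback values are hypothetical and not something k8s defines.

```python
import os


def load_db_config(env=os.environ):
    """Read the database settings from the environment.

    The variable names are made up for this example. In a pod they would
    be filled in from the ConfigMap instead of being baked into the image.
    """
    return {
        "url": env.get("DATABASE_URL", "mongodb-service"),
        "name": env.get("DATABASE_NAME", "mydb"),
    }


print(load_db_config())
```

Because the values come from outside the image, renaming the database means editing the ConfigMap and restarting the pod, not rebuilding the application.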

Part of the external configuration can also be the name of the server, the username and the password, which may change during the application deployment process.

Putting these credentials in a ConfigMap in plaintext would be insecure, even though it's an external configuration.

So for this purpose, k8s has another component

5. Secret:

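It is similar to ConfigMap, but it is used to store secret data like credentials.
It stores the values in base64-encoded format instead of plaintext. Base64 is only an encoding and not encryption, so for real security the Secret still has to be protected, for example by encrypting it at rest and restricting who can read it.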
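
Here is a short Python sketch of what base64 encoding does to a value. The helper names and the sample value "admin" are only for illustration. The point is that anyone who can read the encoded value can decode it again without any key.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears in the data field of a Secret."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding. No key or password is needed."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("admin")
print(encoded)                       # the base64 form of "admin"
print(decode_secret_value(encoded))  # back to the original value
```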

6. Volumes:

In k8s, every time the DB pod restarts, the data stored inside its container is lost, which is very inconvenient. So external storage, like a hard drive, is attached to the DB pod. This is known as a Volume.
That storage may be on the local machine (the node the pod runs on) or remote, which means outside the k8s cluster.
The reason we use volumes is that k8s doesn't manage data persistence for you.

So everything's fine till now and the user can access the application. But sometimes the application pod dies or has to be restarted, for example when we roll out a change, which leads to downtime where the user can't access the application.

7. Deployments:

To avoid these situations, in k8s we replicate pods. We don't create the same pod again by hand. Instead, we define a blueprint for the pods and specify how many replicas of the pod we want to run.

That blueprint is called a Deployment.
In practice we mostly don't create pods directly. We create deployments and specify how many replicas of the pod we need.
A Deployment is an abstraction over pods, and a pod is an abstraction over containers.
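
As a rough sketch of what such a blueprint contains, the Python snippet below builds a Deployment definition as a dictionary and prints it as JSON. The name "my-app", the image "nginx" and the replica count are placeholders picked for this example. The field names follow the apps/v1 Deployment format.

```python
import json


def deployment_manifest(name, image, replicas):
    """Build a minimal Deployment definition as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # How many copies of the pod should be running.
            "replicas": replicas,
            # Which pods belong to this deployment.
            "selector": {"matchLabels": labels},
            # The blueprint for each pod.
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = deployment_manifest("my-app", "nginx", 3)
print(json.dumps(manifest, indent=2))
```

Configuration files are usually written in YAML, but kubectl apply -f also accepts JSON, so the printed output could be saved to a file and applied as it is.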

So now if your pod crashes at any time there's no worry, because another replica keeps serving requests. We need our DB to be replicated as well, but we can't replicate a DB with a deployment, because the DB has state (its data), and replicas that read and write the same data without coordination cause data inconsistencies. For that purpose, we have another k8s component.

8. StatefulSet:

It is meant specifically for stateful applications like databases. MySQL, MongoDB and Elasticsearch should be created using StatefulSets and not deployments.
It is responsible for replicating and scaling database pods, just like a deployment does for stateless pods, while giving each replica a stable identity and its own storage so that database reads and writes can be kept synchronized and no inconsistencies occur.
However, deploying database applications with a StatefulSet is tedious.

That's why it's also common practice to host database applications outside of the k8s cluster. Inside the cluster you then run only the deployments, i.e. the stateless applications that replicate and scale with no problem, and they communicate with the external database.

So now even if one of our application pods or database pods crashes, we have a replica of it, which avoids downtime.

Kubernetes Architecture

A Kubernetes cluster mainly consists of Worker Nodes and a Master (now called the Control Plane).

Worker Node:

Each Node has multiple pods on it
Worker Nodes do the actual work like running those pods with containers inside

3 processes must be installed on every Node

1. Container runtime: I used Docker as the Container runtime.

The application pods have containers running inside them, and that's why we need a container runtime installed on every node.

2. Kubelet: It interacts with both the container runtime and the node

Kubelet is responsible for starting the pod with a container inside it and assigning resources from that node to the container like CPU, RAM etc.

3. Kube-proxy: It is responsible for forwarding requests from services to pods

Kube-proxy makes sure that this communication works in a performant way with low overhead.

Master Node:

4 processes run on every master node. They control the cluster state and the worker nodes as well.

1. API Server:

So if you want to deploy an application on the k8s cluster, you interact with the API server using some client.
It's like a cluster gateway that receives the initial requests for any updates to the cluster, as well as queries about the cluster.
It also acts as a gatekeeper for authentication, to make sure that only authenticated and authorized requests get through to the cluster.

Whenever you want to schedule new pods, deploy new applications, create a new service or any other component, you have to interact with the API server on the master node. It will validate your request and, if everything's fine, hand it over to the other processes so that the pod you've requested gets scheduled.

2. Scheduler:

Now, after the API server has accepted the request, the Scheduler takes over to get the Pod started on one of the worker nodes. The scheduler has an intelligent way of deciding on which specific worker node the next pod will be scheduled.
First, the scheduler looks at how much of each resource (CPU, RAM) the new Pod needs and how much is still available on the worker nodes. Then it assigns the new Pod to a node that has enough room, typically the one that is less busy or has more resources available.

Note: The scheduler just decides on which Node the new Pod should be scheduled.

The process that actually starts the Pod with the container is the Kubelet.

So, the kubelet on the chosen node picks up the scheduler's decision (through the API server) and starts the pod on that node.
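
The decision described above can be sketched in a few lines of Python. This is a heavily simplified toy version and not the real scheduler, which takes many more rules into account. The node names, the resource numbers and the "most free CPU wins" rule are assumptions made for this example.

```python
def pick_node(free_resources, pod_request):
    """Pick a worker node for a pod.

    free_resources: {"node-name": {"cpu": millicores, "memory": MiB}}
    pod_request:    {"cpu": millicores, "memory": MiB}
    Returns the chosen node name, or None if the pod fits nowhere.
    """
    # Step 1: filter out the nodes that cannot fit the pod at all.
    candidates = [
        name for name, free in free_resources.items()
        if free["cpu"] >= pod_request["cpu"]
        and free["memory"] >= pod_request["memory"]
    ]
    if not candidates:
        return None
    # Step 2: among the remaining nodes, prefer the one with the most free
    # CPU, using free memory as the tie-breaker.
    return max(
        candidates,
        key=lambda name: (free_resources[name]["cpu"],
                          free_resources[name]["memory"]),
    )


# Hypothetical cluster with three worker nodes.
nodes = {
    "worker-1": {"cpu": 500, "memory": 1024},
    "worker-2": {"cpu": 2000, "memory": 4096},
    "worker-3": {"cpu": 1500, "memory": 512},
}
print(pick_node(nodes, {"cpu": 250, "memory": 1024}))
```

Note that the function only returns a node name. Just like the scheduler, it decides where the pod should go and leaves the actual starting of the pod to someone else.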

3. Controller Manager

So, when pods die, there must be a way of detecting that they died and rescheduling them as soon as possible, and that's what the controller manager does.
It detects cluster state changes, such as the crashing of pods.
When pods die, the controller manager detects that and tries to recover the cluster state as soon as possible. For that, it requests replacement pods, the scheduler decides which nodes they go to (the same procedure as above), and the kubelet on each of those nodes starts them.
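
The core idea here is comparing the desired state with the actual state and acting on the difference. A minimal Python sketch of one such comparison is shown below. The function name, the action tuples and the pod names are invented for illustration and do not correspond to real k8s APIs.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions needed to get from the actual to the desired state.

    running_pods is a list of names of pods that are currently alive.
    """
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: ask for the missing ones to be created.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods: remove the surplus.
        return [("delete", name) for name in running_pods[:-diff]]
    # Actual state already matches the desired state.
    return []


# We want 3 replicas, but 2 of them just crashed.
print(reconcile(3, ["my-app-1"]))
```

In a real cluster this check is not a one-off. It keeps running, so the cluster is pulled back to the desired state every time something changes.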

4. etcd:

It is a key-value store of the cluster state. It's like the cluster brain.
Cluster changes, like new pods being created or pods being rescheduled, are stored in the key-value store.
The reason it's called the cluster brain is that all the mechanisms, like the scheduler and the controller manager, work because of its data.

Note: Application data is not stored in etcd. Only cluster state data is stored.
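
If "key-value store" sounds abstract, the tiny Python sketch below shows the idea with an in-memory dictionary. It is not etcd and it leaves out everything that makes etcd reliable, such as replication across masters. The key layout and the stored values are illustrative assumptions.

```python
class KeyValueStore:
    """A toy in-memory key-value store holding cluster state."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def list_prefix(self, prefix):
        """Return all entries whose key starts with the given prefix."""
        return {k: v for k, v in sorted(self._data.items())
                if k.startswith(prefix)}


store = KeyValueStore()
# Only cluster state goes in here, never the application's own data.
store.put("/registry/pods/default/my-app-1", {"node": "worker-2", "phase": "Running"})
store.put("/registry/pods/default/my-app-2", {"node": "worker-1", "phase": "Pending"})
store.put("/registry/services/default/my-app-service", {"clusterIP": "10.96.0.10"})

print(store.list_prefix("/registry/pods/"))
```

A scheduler or a controller would answer questions like "which pods are still pending?" by reading this kind of data.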

In practice, a k8s cluster is usually made up of multiple masters.

Example Cluster Set-Up:

For a small cluster, you will probably have 2 master nodes and 3 worker nodes.
The master processes have less work, so they need fewer resources.
The worker nodes do the actual work, like running those pods with containers inside, so they need more resources.
As your application's complexity and its demand for resources increase, you may add more master and worker node servers to your cluster.

Add new Master/Node server:

1. get a new bare server

2. install all the master/worker node processes

3. add it to the cluster

Minikube

It's basically a one-node cluster where the master processes and the worker processes both run on one node. This node has a Docker container runtime pre-installed.

It runs in a virtual machine on your computer, for example through VirtualBox.

To put it simply:

Minikube creates a virtual machine (for example with VirtualBox) on your laptop
The node runs in that virtual machine
It is a 1-node k8s cluster
It is used for testing purposes

Kubectl

So now that you have Minikube as a virtual node on your laptop, you need some way to interact with the cluster.
You need a way to create pods and other k8s components on the node, and the way to do it is kubectl.
It is a CLI tool for the k8s cluster.
As mentioned above, if you want to do anything on the k8s cluster you must first interact with the API server. You have several ways to do that: the UI, the API and the CLI (kubectl). Kubectl is the most powerful of the 3 clients.
Once kubectl submits a command to the API server, the worker processes on the Minikube node actually make it happen.
Kubectl is not just for Minikube. It is the tool to interact with any type of Kubernetes cluster setup.

Some important points to be noted when working on the k8s cluster

Layers of Abstraction:

1. Deployment manages a replica set

2. A replica set manages all the replicas of that pod

3. Pod is an abstraction of a container

4. Container (for example, a Docker container)

If the deployment is deleted, all of its replica sets and pods will also get deleted.
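
The way deletion travels down these layers can be sketched in Python as a simple ownership chain. This is a toy model under my own assumptions: the object names are made up, and the real cleanup in k8s is handled by the cluster itself rather than by a function like this.

```python
def delete_cascade(owners, name):
    """Delete an object and everything it owns, directly or indirectly.

    owners maps each object name to the name of its owner (or None).
    Returns the names that were deleted, parents before children.
    """
    deleted = [name]
    # Iterate over a snapshot, because the dictionary shrinks as we delete.
    for child, owner in list(owners.items()):
        if owner == name:
            deleted.extend(delete_cascade(owners, child))
    owners.pop(name, None)
    return deleted


# Deployment -> replica set -> pods, plus one unrelated pod.
owners = {
    "my-app": None,
    "my-app-rs": "my-app",
    "my-app-rs-pod-1": "my-app-rs",
    "my-app-rs-pod-2": "my-app-rs",
    "other-pod": None,
}
removed = delete_cascade(owners, "my-app")
print(removed)
print(owners)
```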

Browser Request Flow through K8s components:

The request comes from the browser ➡️ it goes to the external service of mongo express ➡️ which forwards it to the mongo express pod ➡️ the pod then connects to the internal service of MongoDB, using the DB URL from the ConfigMap ➡️ the internal service forwards the request to the MongoDB pod, which authenticates it using the credentials from the Secret.

K8s Namespaces

What is a Namespace?

In k8s you can organize resources in namespaces
It's like a virtual cluster inside a cluster

When you create a cluster, by default k8s gives you 4 namespaces

kube-system:

- This is not meant for our use.

- We shouldn't create or modify anything in kube-system.

- The components deployed in this namespace are system processes.

- These include the master processes and other cluster system components.

kube-public:

- This contains the publicly accessible data.

- It has a ConfigMap that contains cluster information that is accessible without authentication.

kube-node-lease:

- It holds the information about the heartbeats of nodes.

- Each node has an associated lease object in the namespace.

- It is used to determine the availability of a node.

default:

- We use this namespace to create resources at the beginning if we haven't created a namespace of our own.

Let's see some Kubectl commands

Kubectl & Minikube Commands

kubectl version - shows the client version and the server version
minikube start - creates and starts the cluster
kubectl create deployment [NAME] --image=[IMAGE] - creates a deployment, which is an abstraction over pods, from the given image (pulled from Docker Hub by default)
kubectl get all - lists the main resources (such as pods, services, deployments and replica sets) in the current namespace
kubectl get nodes - gets the status of the nodes
kubectl get pods - gets the status of the pods
kubectl get deployment - gets the status of the deployments
kubectl get replicaset - gets the status of the replica sets
kubectl edit deployment [NAME] - opens the configuration of that deployment in an editor so you can change it, then save and close
kubectl describe pod [POD_NAME] - displays detailed info about the pod
kubectl logs [POD_NAME] - shows the logs of that pod
kubectl exec -it [POD_NAME] -- /bin/bash - opens an interactive terminal inside the container of that pod
kubectl delete deployment [NAME] - deletes that deployment
kubectl apply -f [fileName.yaml] - creates the components defined in the configuration file, and can also be used to update them after they are created
kubectl get pod -o wide - displays more information about the pods, such as the IP address and the node
minikube service mongo-express-service - makes the external service reachable from your machine through a URL and opens it in the browser
kubectl apply -f [fileName.yaml] --namespace=[NAME] - creates the components in the given namespace

That's a wrap for this blog, guys. I hope you've learned something useful from it.

My upcoming blogs will be on AWS and Kubernetes part-2.

To Contact me:

Twitter: twitter.com/Chris__Jonathan

LinkedIn: linkedin.com/in/chris-jonathan

Thank you guys😊