[Kubernetes Data Platform][Part 2.1]: Highly available Kubernetes cluster with kubeadm

Viet_1846

9 min read · Jul 31, 2024

In this article, I will guide you through the steps to install a Highly Available (HA) Kubernetes cluster on VirtualBox virtual machines using kubeadm.

Kubeadm is a tool built to provide best-practice “fast paths” for creating Kubernetes clusters. It performs the actions necessary to get a minimum viable, secure cluster up and running in a user-friendly way. Kubeadm’s scope is limited to the local node filesystem and the Kubernetes API, and it is intended to be a composable building block of higher level tools.
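For reference, the flags used later in step 4 (control-plane endpoint, advertise address, pod network CIDR) can also be expressed as a declarative kubeadm configuration file. A minimal sketch only, assuming Kubernetes v1.29 and the VM IP plan used later in this article; this series sticks to the flag form shown in bootstrap_kmaster_1.sh:

kubeadm-config.yaml

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.16.101            # kmaster1
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0
controlPlaneEndpoint: "172.16.16.100:6443"   # HAProxy load balancer
networking:
  podSubnet: "192.168.0.0/16"                # pod CIDR expected by the Calico manifests

With a file like this you would run kubeadm init --config kubeadm-config.yaml --upload-certs instead of passing the flags individually.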

The architecture is as follows:

Kubeadm HA Architecture

DEPLOYMENT STEPS

1. Initialize Virtual Machines:

  • Create VirtualBox VMs using Vagrant. Configure the VMs in the Vagrantfile. Adjust RAM and CPU as needed.
  • This will create 7 VMs: loadbalancer (172.16.16.100), kmaster1 (172.16.16.101), kmaster2 (172.16.16.102), kmaster3 (172.16.16.103), kworker1 (172.16.16.201), kworker2 (172.16.16.202), kworker3 (172.16.16.203).

2. Install and Configure HAProxy:

  • Install and configure HAProxy on the loadbalancer VM. HAProxy acts as the load balancer for the Kubernetes cluster. (Details in loadbalancer.sh)

3. Configure and Install Basic Tools:

  • On all nodes, configure and install basic tools: containerd runtime, kubeadm, kubelet, and kubectl. (Details in prepare_all_k8s_node.sh)

4. Initialize Kubernetes Cluster:

  • Initialize the Kubernetes cluster on master node 1. Obtain the join commands for master and worker nodes.
  • Install Calico network for the cluster. (Details in bootstrap_kmaster_1.sh)

5. Join Master Nodes:

  • Join master nodes 2 and 3 to the cluster using the join master command obtained from step 4.

6. Join Worker Nodes:

  • Join worker nodes 1, 2, and 3 to the cluster using the join worker command obtained from step 4.

7. Install Basic Kubernetes Components:

  • Install Nginx Ingress Controller: Ingress controller for Kubernetes using NGINX as a reverse proxy and load balancer.
  • Install Longhorn: Distributed block storage system for Kubernetes. Longhorn is cloud-native storage built using Kubernetes and container primitives.
  • Install Metrics Server: Scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.

8. Install Testing Tools:

  • Install tools to test cluster functionality.

HANDS-ON STEPS

Reference Repository: https://github.com/viethqb/data-platform-notes/tree/main/kubernetes/kubeadm

IMPORTANT: Convention for Bash Scripts

Throughout this series, the following convention applies to Bash scripts:

Commands in the form “> command” should be executed on the local laptop.

Commands in the form “root@kworker1 command” should be executed on the kworker1 VM as the root user.

1. Initialize Virtual Machines

Vagrantfile

# -*- mode: ruby -*-
# vi: set ft=ruby :

ENV['VAGRANT_NO_PARALLEL'] = 'yes'

Vagrant.configure(2) do |config|

  config.vm.provision "shell", path: "bootstrap.sh"

  # Load Balancer Node
  config.vm.define "loadbalancer" do |lb|
    lb.vm.box = "generic/ubuntu2204"
    lb.vm.hostname = "loadbalancer"
    lb.vm.network "private_network", ip: "172.16.16.100"
    lb.vm.provider "virtualbox" do |v|
      v.name = "loadbalancer"
      v.memory = 2048
      v.cpus = 1
    end
  end

  MasterCount = 3

  # Kubernetes Master Nodes
  (1..MasterCount).each do |i|
    config.vm.define "kmaster#{i}" do |masternode|
      masternode.vm.box = "generic/ubuntu2204"
      masternode.vm.hostname = "kmaster#{i}"
      masternode.vm.network "private_network", ip: "172.16.16.10#{i}"
      masternode.vm.provider "virtualbox" do |v|
        v.name = "kmaster#{i}"
        v.memory = 2048
        v.cpus = 2
      end
    end
  end

  NodeCount = 3

  # Kubernetes Worker Nodes
  (1..NodeCount).each do |i|
    config.vm.define "kworker#{i}" do |workernode|
      workernode.vm.box = "generic/ubuntu2204"
      workernode.vm.hostname = "kworker#{i}"
      workernode.vm.network "private_network", ip: "172.16.16.20#{i}"
      workernode.vm.provider "virtualbox" do |v|
        v.name = "kworker#{i}"
        v.memory = 4096
        v.cpus = 1
      end
    end
  end

end

Start VirtualBox VM Using Vagrant

> cd ~/Documents
> git clone https://github.com/viethqb/data-platform-notes.git
> cd data-platform-notes/kubernetes/kubeadm

> vagrant up
> vagrant status
> ssh-copy-id root@172.16.16.100
> ssh-copy-id root@172.16.16.101
> ssh-copy-id root@172.16.16.102
> ssh-copy-id root@172.16.16.103
> ssh-copy-id root@172.16.16.201
> ssh-copy-id root@172.16.16.202
> ssh-copy-id root@172.16.16.203
# user: root
# password: admin

2. Install & Configure HAProxy

loadbalancer.sh

#!/bin/bash

## !IMPORTANT ##
#
echo "[TASK 1] Update /etc/hosts file"
cat >>/etc/hosts<<EOF
172.16.16.100 loadbalancer
172.16.16.101 kmaster1
172.16.16.102 kmaster2
172.16.16.103 kmaster3
172.16.16.201 kworker1
172.16.16.202 kworker2
172.16.16.203 kworker3
EOF


echo "[TASK 2] Install HAProxy"
apt-get update && apt-get install -y haproxy

echo "[TASK 3] Update HAProxy configs /etc/haproxy/haproxy.cfg"
cat >>/etc/haproxy/haproxy.cfg<<EOF

frontend kubernetes-frontend
    bind *:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server kmaster1 kmaster1:6443 check fall 3 rise 2
    server kmaster2 kmaster2:6443 check fall 3 rise 2
    server kmaster3 kmaster3:6443 check fall 3 rise 2
EOF


echo "[TASK 4] Restart HAProxy service"
systemctl restart haproxy.service

Install & Configure HAProxy

> scp loadbalancer.sh root@172.16.16.100:~
> ssh root@172.16.16.100
root@loadbalancer:~# bash loadbalancer.sh
root@loadbalancer:~# systemctl enable haproxy.service
root@loadbalancer:~# systemctl status haproxy.service
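Optionally, you can confirm HAProxy is listening on port 6443 before bootstrapping the cluster. The three backends will be reported DOWN until the API servers exist, which is expected at this point.

root@loadbalancer:~# ss -tlnp | grep 6443
# from your laptop, assuming netcat is installed:
> nc -zv 172.16.16.100 6443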

3. Configure and Install Basic Tools

prepare_all_k8s_node.sh

#!/bin/bash

## !IMPORTANT ##
#
## This script has been tested only with the generic/ubuntu2204 Vagrant box
## If you use a different Ubuntu version or a different Ubuntu Vagrant box, test it again
#

echo "[TASK 1] Disable and turn off SWAP"
sed -i '/swap/d' /etc/fstab
swapoff -a

echo "[TASK 2] Stop and Disable firewall"
systemctl disable --now ufw >/dev/null 2>&1

echo "[TASK 3] Enable and Load Kernel modules"
cat >>/etc/modules-load.d/containerd.conf<<EOF
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

echo "[TASK 4] Add Kernel settings"
cat >>/etc/sysctl.d/kubernetes.conf<<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sysctl --system >/dev/null 2>&1

echo "[TASK 5] Install containerd runtime"
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq >/dev/null
apt-get install -qq -y apt-transport-https ca-certificates curl gnupg lsb-release >/dev/null
mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list
apt-get update -qq >/dev/null
apt-get install -qq -y containerd.io >/dev/null
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup \= false/SystemdCgroup \= true/g' /etc/containerd/config.toml
systemctl restart containerd
systemctl enable containerd >/dev/null

echo "[TASK 6] Set up kubernetes repo"
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' > /etc/apt/sources.list.d/kubernetes.list

echo "[TASK 7] Install Kubernetes components (kubeadm, kubelet and kubectl)"
apt-get update -qq >/dev/null
apt-get install -qq -y kubeadm kubelet kubectl >/dev/null
apt-mark hold kubelet kubeadm kubectl

echo "[TASK 8] Update /etc/hosts file"
cat >>/etc/hosts<<EOF
172.16.16.100 loadbalancer
172.16.16.101 kmaster1
172.16.16.102 kmaster2
172.16.16.103 kmaster3
172.16.16.201 kworker1
172.16.16.202 kworker2
172.16.16.203 kworker3
EOF

bootstrap_kmaster_1.sh

#!/bin/bash

echo "[TASK 1] Pull required containers"
kubeadm config images pull

echo "[TASK 2] Initialize Kubernetes Cluster"
kubeadm init --control-plane-endpoint="172.16.16.100:6443" --upload-certs --apiserver-advertise-address=172.16.16.101 --pod-network-cidr=192.168.0.0/16
echo "[TASK 3] Deploy Calico network"
kubectl --kubeconfig=/etc/kubernetes/admin.conf create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml
kubectl --kubeconfig=/etc/kubernetes/admin.conf create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/custom-resources.yaml

# You can now join any number of control-plane nodes by running the following command on each as root:
# IMPORTANT: You also need to pass --apiserver-advertise-address to the join command when you join the other master nodes.

Copy scripts to all nodes

> scp bootstrap_kmaster_1.sh prepare_all_k8s_node.sh root@172.16.16.101:~
> scp prepare_all_k8s_node.sh root@172.16.16.102:~
> scp prepare_all_k8s_node.sh root@172.16.16.103:~
> scp prepare_all_k8s_node.sh root@172.16.16.201:~
> scp prepare_all_k8s_node.sh root@172.16.16.202:~
> scp prepare_all_k8s_node.sh root@172.16.16.203:~

Run on All k8s Nodes

> ssh root@172.16.16.101
root@kmaster1:~# bash prepare_all_k8s_node.sh
> ssh root@172.16.16.102
root@kmaster2:~# bash prepare_all_k8s_node.sh
> ssh root@172.16.16.103
root@kmaster3:~# bash prepare_all_k8s_node.sh
> ssh root@172.16.16.201
root@kworker1:~# bash prepare_all_k8s_node.sh
> ssh root@172.16.16.202
root@kworker2:~# bash prepare_all_k8s_node.sh
> ssh root@172.16.16.203
root@kworker3:~# bash prepare_all_k8s_node.sh
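Optionally, after the script finishes you can spot-check a node to confirm the runtime and the Kubernetes packages are in place, for example on kworker1:

root@kworker1:~# containerd --version
root@kworker1:~# kubeadm version -o short
root@kworker1:~# systemctl is-active containerd
root@kworker1:~# swapon --show   # should print nothing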

4. Initialize Kubernetes Cluster

Run on master 1

> ssh root@172.16.16.101
root@kmaster1:~# bash bootstrap_kmaster_1.sh

You can now join any number of control-plane nodes by running the following command on each as root:

IMPORTANT: You also need to pass --apiserver-advertise-address to the join command when you join the other master nodes.

kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146 \
--control-plane --certificate-key dd62cee26cdde801822734399730b6cbde061a6f45ba6171653b6b0d8ff94b1e \
--apiserver-advertise-address <master-node-ip>

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146
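Note that the bootstrap token is only valid for 24 hours and the uploaded certificates for 2 hours. If they have expired before you join the remaining nodes, regenerate them on kmaster1:

root@kmaster1:~# kubeadm token create --print-join-command
root@kmaster1:~# kubeadm init phase upload-certs --upload-certs   # prints a fresh key for --certificate-key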

5. Join Master Nodes

Run on master 2

> ssh root@172.16.16.102
root@kmaster2:~# kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146 \
--control-plane --certificate-key dd62cee26cdde801822734399730b6cbde061a6f45ba6171653b6b0d8ff94b1e \
--apiserver-advertise-address 172.16.16.102
root@kmaster2:~# mkdir -p $HOME/.kube
root@kmaster2:~# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
root@kmaster2:~# sudo chown $(id -u):$(id -g) $HOME/.kube/config
root@kmaster2:~# kubectl get nodes -owide

Run on master 3

> ssh root@172.16.16.103
root@kmaster3:~# kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146 \
--control-plane --certificate-key dd62cee26cdde801822734399730b6cbde061a6f45ba6171653b6b0d8ff94b1e \
--apiserver-advertise-address 172.16.16.103
root@kmaster3:~# mkdir -p $HOME/.kube
root@kmaster3:~# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
root@kmaster3:~# sudo chown $(id -u):$(id -g) $HOME/.kube/config
root@kmaster3:~# kubectl get nodes -owide

6. Join Worker Nodes

Run on workers 1, 2, and 3

> ssh root@172.16.16.201
root@kworker1:~# kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146

> ssh root@172.16.16.202
root@kworker2:~# kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146

> ssh root@172.16.16.203
root@kworker3:~# kubeadm join 172.16.16.100:6443 --token gawviv.twjkghdivqba5ypk \
--discovery-token-ca-cert-hash sha256:35990a51e4f3669760ee947dcaecf27ef83d5ce63c5b8a14af06c174bf838146


root@kmaster3:~# kubectl get nodes -owide
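Optionally, you can give the worker nodes a role label so the ROLES column of kubectl get nodes shows them as workers (kubeadm leaves it empty for workers by default):

root@kmaster3:~# kubectl label node kworker1 kworker2 kworker3 node-role.kubernetes.io/worker=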

7. Install Basic Kubernetes Components

Install Nginx Ingress Controller

> ssh root@172.16.16.103
root@kmaster3:~# kubectl get nodes

# Install Helm CLI
root@kmaster3:~# curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
root@kmaster3:~# bash ./get_helm.sh

# Install Nginx Ingress Controller
root@kmaster3:~# helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
root@kmaster3:~# helm repo update
root@kmaster3:~# helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx --set controller.hostNetwork=true,controller.service.type="",controller.kind=DaemonSet --namespace ingress-nginx --version 4.10.1 --create-namespace --debug
root@kmaster3:~# kubectl -n ingress-nginx get all
root@kmaster3:~# kubectl get IngressClass
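Because the controller runs as a hostNetwork DaemonSet, every worker node answers on ports 80/443 directly. A quick check from your laptop (assuming curl is installed) should return a 404 from nginx, which is expected until an Ingress resource exists:

> curl -i http://172.16.16.201
# expect: HTTP/1.1 404 Not Found (served by ingress-nginx)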

Install Longhorn StorageClass

# Install Longhorn sc
root@kmaster3:~# helm repo add longhorn https://charts.longhorn.io
root@kmaster3:~# helm repo update
root@kmaster3:~# helm upgrade --install longhorn longhorn/longhorn --set persistence.defaultClassReplicaCount=1 --namespace longhorn-system --create-namespace --version 1.6.1 --debug
root@kmaster3:~# kubectl -n longhorn-system get po
root@kmaster3:~# kubectl get sc
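To verify that Longhorn can actually provision volumes, you can create a small test PVC against the longhorn StorageClass (the PVC name below is just an example) and delete it afterwards:

test-pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi

root@kmaster3:~# kubectl apply -f test-pvc.yaml
root@kmaster3:~# kubectl get pvc longhorn-test-pvc   # should become Bound
root@kmaster3:~# kubectl delete -f test-pvc.yaml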

Install Metrics Server

# Install Metrics Server
root@kmaster3:~# kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
root@kmaster3:~# kubectl patch deployment metrics-server -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
root@kmaster3:~# kubectl top no
root@kmaster3:~# kubectl -n ingress-nginx top po
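With Metrics Server in place, kubectl top works and so does the built-in autoscaling pipeline. A minimal HorizontalPodAutoscaler sketch (my-app is a hypothetical Deployment, shown only to illustrate what Metrics Server enables):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70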

8. Install Testing Tools

minio-values.yaml

auth:
  rootUser: "admin"
  rootPassword: "password"
ingress:
  enabled: true
  ingressClassName: "nginx"
  hostname: minio.lakehouse.local
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: 1024m
defaultBuckets: "lakehouse, airflow, risingwave, kafka"
persistence:
  size: 10Gi

Install Minio on Kubernetes

> ssh root@172.16.16.103
# Install Minio
root@kmaster3:~# helm repo add bitnami https://charts.bitnami.com/bitnami
root@kmaster3:~# helm repo update
root@kmaster3:~# helm upgrade --install minio -n minio -f minio-values.yaml bitnami/minio --create-namespace --debug

root@kmaster3:~# kubectl -n minio get po -owide
root@kmaster3:~# kubectl -n minio get pvc
root@kmaster3:~# kubectl -n minio get ing

# Update local hosts file on your laptop
> sudo tee -a /etc/hosts <<EOF
172.16.16.201 minio.lakehouse.local
EOF


Access MinIO at http://minio.lakehouse.local in your web browser (user: admin, password: password).
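If you have the MinIO client (mc) installed on your laptop, you can also verify the ingress and the default buckets from the command line (the file name below is just an example):

> mc alias set local http://minio.lakehouse.local admin password
> mc ls local                     # should list lakehouse, airflow, risingwave and kafka
> mc cp ./some-file.txt local/lakehouse/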


9. Destroy Cluster

> vagrant destroy -f