K8s Bare Metal
I know there are already plenty of posts on how to configure a bare-metal K8s cluster, but after studying a lot, here's my take on it.
Here we will be configuring a cluster composed of:
- 1 Master Node
- 4 Worker Nodes
- 1 NFS server so we can use storage over network
I'm provisioning my cluster on ESXi running behind pfSense, so traffic can be controlled and DHCP does not touch my main home network. I'll also use an internal DNS to route everything, with domains under *.k.vito.sh exposing services hosted in the cluster, and *.kube.vito.sh used to access the VMs.
All machines will be provisioned with 8GB RAM and 4 cores. The network will look like this:

| Machine | IP | DNS |
| ----------- | ---------- | ------------------- |
| k8s-master | 10.0.11.0 | master.kube.vito.sh |
| k8s-worker1 | 10.0.11.1 | w1.kube.vito.sh |
| k8s-worker2 | 10.0.11.2 | w2.kube.vito.sh |
| k8s-worker3 | 10.0.11.3 | w3.kube.vito.sh |
| k8s-worker4 | 10.0.11.4 | w4.kube.vito.sh |
| k8s-nfs | 10.0.11.5 | nfs.kube.vito.sh |
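Since everything below assumes these names resolve, it is worth sanity-checking the internal DNS before going further. A rough sketch, run from any machine that can reach the internal nameserver (10.0.1.3 in my case) and assuming dig is available:
# Every node name should resolve to the IP listed in the table above.
for host in master w1 w2 w3 w4 nfs; do
  dig +short @10.0.1.3 "$host.kube.vito.sh"
done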
Provisioning Nodes
Warning: You will notice I'm running several commands without sudo. I usually just get a root shell using sudo su in order to execute everything without hassle.
All nodes except k8s-nfs follow the same steps:
- Install Ubuntu 22.04 LTS. I use a minimized server installation.
- Make sure everything is up-to-date:
apt update && apt upgrade -y && reboot
- Drop snap:
apt purge snapd
- Install utilities:
apt install iputils-ping vim apt-transport-https ca-certificates curl gnupg2 gpg software-properties-common -y
Install CRI-O
export OS=xUbuntu_22.04
export CRIO_VERSION=1.26
echo "deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/$OS/ /"| sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list
echo "deb http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/$CRIO_VERSION/$OS/ /"|sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable:cri-o:$CRIO_VERSION.list
curl -L https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable:cri-o:$CRIO_VERSION/$OS/Release.key | sudo apt-key add -
curl -L https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/$OS/Release.key | sudo apt-key add -
apt update && apt install cri-o cri-o-runc -y
systemctl enable --now crio
- Check CRI-O status:
systemctl status crio
Enable conmon
- Edit /etc/crio/crio.conf, find and uncomment the line conmon = ""
- Insert /usr/bin/conmon between the quotes
- Restart CRI-O:
systemctl restart crio
- Check its status:
systemctl status crio
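If you prefer not to edit the file by hand, the same change can be scripted. A rough sketch, assuming the stock crio.conf layout (double-check the result before restarting):
# Point conmon at its binary in /etc/crio/crio.conf, then restart CRI-O.
sed -i 's|^#\?\s*conmon = .*|conmon = "/usr/bin/conmon"|' /etc/crio/crio.conf
systemctl restart crio
systemctl status crio --no-pager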
Install CRI-O tools
apt install -y cri-tools
- Check everything is working:
crictl --runtime-endpoint unix:///var/run/crio/crio.sock version
- Check if it is ready: crictl info. RuntimeReady is expected to be true.
Configure Networking
This is optional, but since I use an internal DNS, I need to set nameservers through netplan:
- Edit /etc/netplan/00-installer-config.yaml and add the following line:
nameservers.addresses: [10.0.1.3]
- Run netplan apply
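For context, a sketch of how that ends up looking inside my netplan file. The interface name ens160 is just what my ESXi VMs happen to get; adjust it to whatever your machine uses:
network:
  version: 2
  ethernets:
    ens160:
      dhcp4: true
      nameservers:
        addresses: [10.0.1.3]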
Prepare APT for K8s installation
- Make sure /etc/apt/keyrings exists. Create it if it does not:
[ -d /etc/apt/keyrings ] || mkdir -p -m 755 /etc/apt/keyrings
- Download the public signing key for the K8s packages:
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
- Add the appropriate apt repository:
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
- Install Kubernetes components:
apt update
apt install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
- Enable and start kubelet:
sudo systemctl enable --now kubelet
Update System Configuration for Kubernetes
- Enable required kernel modules:
modprobe overlay
modprobe br_netfilter
- Ensure modules are loaded on restart:
echo overlay | tee -a /etc/modules
echo br_netfilter | tee -a /etc/modules
- Update network configuration:
cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
- Apply system configuration:
sysctl --system
- Disable swap
systemctl mask swap.img.swap
systemctl stop swap.img.swap
swapoff -a
- Update the kubelet configuration to use systemd as its cgroup driver. Create /etc/default/kubelet with the following contents (a consolidated sketch follows this list):
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --runtime-request-timeout=5m
- Apply changes once again
systemctl daemon-reload
systemctl restart kubelet
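For reference, a minimal non-interactive sketch of these last steps, assuming a root shell; the fstab edit is an optional extra I like to add so swap also stays off across reboots:
# Optional: comment out any swap entries so swap stays disabled after a reboot.
sed -ri '/\sswap\s/ s/^/#/' /etc/fstab
# Write the kubelet defaults file with the systemd cgroup driver.
cat > /etc/default/kubelet <<'EOF'
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --runtime-request-timeout=5m
EOF
systemctl daemon-reload
systemctl restart kubelet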
At this point, kubelet will be crash-looping, waiting for kubeadm instructions.
Bootstrapping the Cluster
The next step must be performed only on the master node. This will bootstrap the cluster. Other nodes will join in another step.
To bootstrap the cluster, use kubeadm:
kubeadm init --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint=master.kube.vito.sh
The command may take a while and will output instructions on how to allow non-root access to kubectl and how to join nodes to the cluster. Take note of those instructions!
Exit the root shell, and go back to your non-root user. Then:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Finally, run kubectl get ns. If you get a response, congratulations, you have a working master node!
Setup Flannel
Flannel is a CNI plugin that provides pod networking for the cluster. More information can be found in the flannel-io/flannel repository on GitHub.
Installation is straightforward, requiring a single command:
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Then you can run watch kubectl get pods --all-namespaces and wait for Flannel to boot up.
Setup Workers
Now, SSH into each of the workers and, using the information emitted by kubeadm, join them to the cluster. The command will look like the following and must be executed with root privileges:
kubeadm join master.kube.vito.sh:6443 --token qn2gjj.e6y272oiqhqa2emu \
--discovery-token-ca-cert-hash sha256:401e68dbd991b30bcad71ed2cc67c03c5e970e60ac6eb21db43679104431e678
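If you lose that output or the token expires (bootstrap tokens are valid for 24 hours by default), a fresh join command can be generated on the master node:
# Creates a new token and prints the full kubeadm join command.
kubeadm token create --print-join-command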
Label Workers
Log in to the master node again and check that all nodes are visible using kubectl get nodes. Notice that the workers have no role assigned yet. Label them as workers:
kubectl label node k8s-worker1 node-role.kubernetes.io/worker=worker
kubectl label node k8s-worker2 node-role.kubernetes.io/worker=worker
kubectl label node k8s-worker3 node-role.kubernetes.io/worker=worker
kubectl label node k8s-worker4 node-role.kubernetes.io/worker=worker
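The four commands above can also be collapsed into a small loop; a sketch:
# Label workers 1 through 4 in one go.
for i in 1 2 3 4; do
  kubectl label node "k8s-worker$i" node-role.kubernetes.io/worker=worker
done
Afterwards, kubectl get nodes should show worker in the ROLES column for each of them.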
Bootstrapping the Cluster: Part 2
Install MetalLB
MetalLB allows bare-metal clusters to have ingresses and load balancers without requiring a cloud provider.
The first step to install it is to enable strictARP in kube-proxy. This can be done by running the command below:
kubectl edit configmap kube-proxy -n kube-system
Locate the line strictARP: and change its value from false to true.
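If you'd rather not do this in an editor, MetalLB's documentation suggests piping the ConfigMap through sed; the sketch below simply rewrites strictARP: false to true and re-applies it:
# Flip strictARP in the kube-proxy ConfigMap non-interactively.
kubectl get configmap kube-proxy -n kube-system -o yaml | \
  sed -e "s/strictARP: false/strictARP: true/" | \
  kubectl apply -f - -n kube-system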
Installing it is also simple:
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.9/config/manifests/metallb-native.yaml
Finally, it needs to be configured in order to know which IPs it may use. In my case, the following is enough:
# metallb-ippool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default
  namespace: metallb-system
spec:
  addresses:
  - 10.0.11.50-10.0.11.254
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
Apply the manifest using kubectl apply:
kubectl apply -f metallb-ippool.yaml
Install k8s_gateway
k8s_gateway exposes a DNS server that is able to resolve ingress hostnames to their exposed IPs. Installation is done through Helm:
helm repo add k8s_gateway https://ori-edge.github.io/k8s_gateway/
helm install exdns --set domain=k.vito.sh --set service.loadBalancerIP=10.0.11.53 k8s_gateway/k8s-gateway
I want to expose 10.0.11.53 as the DNS service. My upstream DNS server forwards requests for k.vito.sh to this IP.
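Once the exdns service is up, resolution can be checked directly against it; a quick sketch using dig and the test.k.vito.sh ingress that is created further down:
# Should return the IP the ingress controller gets from MetalLB.
dig +short @10.0.11.53 test.k.vito.sh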
Install nginx-ingress
nginx-ingress is also installed through Helm:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
Apply the chart using the following command:
helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx --create-namespace
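At this point MetalLB should hand the controller an address from the pool. A quick check (the service name below follows the chart's defaults):
# EXTERNAL-IP should be an address from the 10.0.11.50-10.0.11.254 range.
kubectl get svc -n ingress-nginx ingress-nginx-controller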
Test the ingress and networking
The following manifest contains a namespace, a pod, a service, and an ingress that serves NGINX's default welcome page:
# test.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test-ingress
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: test-ingress
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: test-ingress
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
  namespace: test-ingress
spec:
  ingressClassName: "nginx"
  rules:
  - host: test.k.vito.sh
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service
            port:
              number: 80
Apply it using kubectl apply -f test.yaml. Eventually, the page will be available at test.k.vito.sh.
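If DNS has not propagated yet, the ingress controller can also be hit directly with a Host header; a sketch, where 10.0.11.50 stands in for whatever EXTERNAL-IP your controller received:
# Talk to the ingress controller directly, pretending to be test.k.vito.sh.
curl -H "Host: test.k.vito.sh" http://10.0.11.50/
# Or, once DNS resolves:
curl http://test.k.vito.sh/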
Enabling support for PersistentVolumeClaims
For this part, a new machine will be provisioned. I'll use Alpine Linux as it is ultra lightweight, performs well, and installs even faster. I'll configure it with 4GB RAM, 4 CPUs, and 100GB of disk.
Once the VM is provisioned, let’s configure NFS on it:
First, install required packages:
apk update && apk add nfs-utils vim
Create the data directory:
mkdir /data
Update its owner to nobody:nogroup:
chown nobody:nogroup /data
And configure /etc/exports just like on any other Linux server. It is important to note that each node that needs access to the NFS server must be included in /etc/exports:
/data 10.0.11.1(rw,sync,no_subtree_check) 10.0.11.2(rw,sync,no_subtree_check) 10.0.11.3(rw,sync,no_subtree_check) 10.0.11.4(rw,sync,no_subtree_check)
After saving the file, reload settings:
exportfs -afv
And ensure the NFS service starts on boot:
rc-update add nfs
rc-service nfs start
Preparing Workers
Now, on each worker, install nfs-common through APT:
apt install -y nfs-common
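Before installing the provisioner, it is worth confirming that a worker can actually mount the export. A quick manual test, run on any worker:
# Temporarily mount the export, write a file, then clean up.
mount -t nfs nfs.kube.vito.sh:/data /mnt
touch /mnt/test-from-worker && ls -l /mnt
umount /mnt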
And finally, install the provisioner through Helm:
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--create-namespace \
--namespace nfs-system \
--set nfs.server=nfs.kube.vito.sh \
--set nfs.path=/data
Then, try to create a PVC:
# pvc-test.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-client
  resources:
    requests:
      storage: 20Gi
kubectl apply -f pvc-test.yaml
Reading the PVC status, everything should be good:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
pvc-test Bound pvc-ff3b4825-3836-4661-9958-c2b8948135af 20Gi RWX nfs-client <unset> 7s
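To confirm the claim is actually usable, a small throwaway pod can mount it and write a file; a sketch (the pod name and image are just examples):
# pvc-writer-test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-test-writer
  namespace: default
spec:
  containers:
  - name: writer
    image: busybox:latest
    command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-test
After applying it, hello.txt should show up inside the directory the provisioner created for this claim under /data on the NFS server.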
TLS & Other Trinkets
What's missing now is an Argo CD instance and automatic TLS for our services. This will be covered in the future!
Happy K8sing!