Archive February 4, 2023

How to Install Kubernetes Cluster on Ubuntu 22.04 with ZFS

Are you looking for an easy guide on how to install a Kubernetes cluster on Ubuntu 22.04 (Jammy Jellyfish)?

The step-by-step guide on this page shows you how to install a Kubernetes cluster on Ubuntu 22.04 using the kubeadm command.

Kubernetes is a free and open-source container orchestration tool, also known as k8s. With the help of Kubernetes, we can achieve automated deployment, scaling and management of containerized applications.

A Kubernetes cluster consists of worker nodes, on which the application workload is deployed, and a set of master nodes, which manage the worker nodes and pods in the cluster.

In this guide, we are using one master node and two worker nodes. The system requirements on each node are as follows:

  • Minimal install Ubuntu 22.04
  • Minimum 2 GB RAM
  • Minimum 2 CPU cores or 2 vCPUs
  • 20 GB free disk space on /var or more
  • Sudo user with admin rights
  • Internet connectivity on each node

Lab Setup

  • Master Node:  192.168.1.173 – k8smaster.example.net
  • First Worker Node:  192.168.1.174 – k8sworker1.example.net
  • Second Worker Node:  192.168.1.175 – k8sworker2.example.net

Without any delay, let’s jump into the installation steps of the Kubernetes cluster.

Step 1) Set hostname and add entries in the hosts file

Log in to the master node and set the hostname using the hostnamectl command:

sudo hostnamectl set-hostname "k8smaster.example.net"

On the worker nodes, run the matching command (the first on k8sworker1, the second on k8sworker2):

sudo hostnamectl set-hostname "k8sworker1.example.net"
sudo hostnamectl set-hostname "k8sworker2.example.net"

Add the following entries to the /etc/hosts file on each node:

192.168.1.173   k8smaster.example.net k8smaster
192.168.1.174   k8sworker1.example.net k8sworker1
192.168.1.175   k8sworker2.example.net k8sworker2
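If you script this step, it helps to make it safe to run more than once. The following is a minimal sketch, not part of the original guide: the function name add_host_entry and the HOSTS_FILE variable are made up for illustration, and HOSTS_FILE points at a scratch file so the example can be tried without root.

```shell
#!/usr/bin/env bash
# Sketch: append the lab's host entries only when they are missing.
# HOSTS_FILE is a hypothetical variable; set it to /etc/hosts (and run the
# script with sudo) to apply the entries for real.
HOSTS_FILE="$(mktemp)"

add_host_entry() {
  local ip="$1" fqdn="$2" short="$3"
  # -F matches the name literally, -w only as a whole word
  if ! grep -qwF "$fqdn" "$HOSTS_FILE"; then
    printf '%s   %s %s\n' "$ip" "$fqdn" "$short" >> "$HOSTS_FILE"
  fi
}

add_host_entry 192.168.1.173 k8smaster.example.net k8smaster
add_host_entry 192.168.1.174 k8sworker1.example.net k8sworker1
add_host_entry 192.168.1.175 k8sworker2.example.net k8sworker2
# A second call for an existing name adds nothing
add_host_entry 192.168.1.173 k8smaster.example.net k8smaster

cat "$HOSTS_FILE"
```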

Step 2) Disable swap & add kernel settings

Run the swapoff and sed commands below to disable swap. Make sure to run them on all the nodes.

sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
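To see what the sed expression does before touching the real /etc/fstab, you can try it on a copy. This is a small sketch using an invented two-line fstab; the file contents and the FSTAB_COPY variable are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: run the same sed expression against a sample fstab in a temp file.
FSTAB_COPY="$(mktemp)"
cat > "$FSTAB_COPY" <<'EOF'
UUID=1111-2222 /         ext4 defaults 0 1
/swap.img      none      swap sw       0 0
EOF

# Comment out every line that contains " swap " (with a space on each side)
sed -i '/ swap / s/^\(.*\)$/#\1/g' "$FSTAB_COPY"

cat "$FSTAB_COPY"
```

Only the swap line gains a leading "#". Note that the pattern looks for spaces around the word swap, so an fstab that separates its columns with tabs would not match and would need a different pattern.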

Load the following kernel module on all the nodes:

sudo tee /etc/modules-load.d/containerd.conf <<EOF
br_netfilter
EOF
sudo modprobe br_netfilter

Set the following kernel parameters for Kubernetes by running the tee command below:

sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

Reload the above changes by running:

sudo sysctl --system

Step 3) Install containerd runtime

In this guide, we are using the containerd runtime for our Kubernetes cluster. To install containerd, first install its dependencies.

sudo apt install -y curl gnupg2 software-properties-common apt-transport-https ca-certificates

Enable the Docker repository

sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmour -o /etc/apt/trusted.gpg.d/docker.gpg
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Now, run the following apt commands to install containerd

sudo apt update
sudo apt install -y containerd.io

Configure containerd so that it uses systemd as the cgroup driver and zfs as the snapshotter.

containerd config default | sudo tee /etc/containerd/config.toml >/dev/null 2>&1
sudo sed -i 's/SystemdCgroup \= false/SystemdCgroup \= true/g' /etc/containerd/config.toml
sudo sed -i 's/snapshotter \= "overlayfs"/snapshotter \= "zfs"/g' /etc/containerd/config.toml
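The two substitutions can be checked on a scratch file first. The sketch below uses a trimmed, hand-written fragment in the style of the default containerd config rather than the full generated file, and it drops the backslashes before "=" since they are not needed; CONFIG_COPY is a name chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: apply equivalent substitutions to a small sample of config.toml.
CONFIG_COPY="$(mktemp)"
cat > "$CONFIG_COPY" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' "$CONFIG_COPY"
sed -i 's/snapshotter = "overlayfs"/snapshotter = "zfs"/g' "$CONFIG_COPY"

# Show the two settings that matter for this guide
grep -E 'SystemdCgroup|snapshotter' "$CONFIG_COPY"
```

Running the same grep against /etc/containerd/config.toml after the real commands is a quick way to confirm both settings changed.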

You now need a ZFS dataset for containerd's zfs snapshotter to use. The command below creates one on an existing pool and mounts it at the snapshotter's default path, so everything should work with the config created above. If you want a different path, you might need to set the path for the zfs snapshotter as well.

sudo zfs create -o mountpoint=/var/lib/containerd/io.containerd.snapshotter.v1.zfs <your zfs pool>/containerd

Restart and enable containerd service

sudo systemctl restart containerd
sudo systemctl enable containerd

Step 4) Add apt repository for Kubernetes

Execute following commands to add apt repository for Kubernetes

sudo curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmour -o /etc/apt/trusted.gpg.d/google.gpg
sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"

Note: kubernetes-xenial is the only distribution name this repository publishes, and it is used for every Ubuntu release, including 22.04 (Jammy Jellyfish). The Kubernetes project has since deprecated this Google-hosted repository in favor of pkgs.k8s.io, so check the official installation documentation if the commands above fail.

Step 5) Install Kubernetes components Kubectl, kubeadm & kubelet

Install the Kubernetes components kubectl, kubelet and kubeadm on all the nodes. Run the following set of commands:

sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Step 6) Initialize Kubernetes cluster with Kubeadm command

Now, we are all set to initialize the Kubernetes cluster. Run the following kubeadm command from the master node only.

sudo kubeadm init --control-plane-endpoint=k8smaster.example.net

Output of above command should end with something like the following,

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.42:6443 --token vt4ua6.23wer232423134 \
        --discovery-token-ca-cert-hash sha256:3a2c36feedd14cff3ae835abcdefgesadf235adca0369534e938ccb307ba5

The output above confirms that the control plane has been initialized successfully. It also lists the commands for interacting with the cluster and the command the worker nodes use to join it.

So, to start interacting with cluster, run following commands from the master node,

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Now, try to run following kubectl commands to view cluster and node status

kubectl cluster-info
kubectl get nodes

Output,

user@server:~ $ kubectl cluster-info
Kubernetes control plane is running at https://10.0.0.42:6443
CoreDNS is running at https://10.0.0.42:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
user@server:~ $ kubectl get nodes
NAME         STATUS   ROLES           AGE    VERSION
k8smaster   Ready    control-plane   153m   v1.26.1

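If you only want a single-node cluster, remove the control-plane taints so that regular pods can be scheduled on the master. The first two commands target the older node-role.kubernetes.io/master taint; on releases where kubeadm no longer sets it (such as the v1.26.1 shown above) they report that the taint was not found, and only the last command is needed.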

kubectl taint node k8smaster node-role.kubernetes.io/master:NoSchedule-
kubectl taint nodes --all node-role.kubernetes.io/master-
kubectl taint nodes --all  node-role.kubernetes.io/control-plane-

Join both worker nodes to the cluster. The command is already in the kubeadm init output; just copy and paste it on the worker nodes:

sudo kubeadm join k8smaster.example.net:6443 --token vt4ua6.23wer232423134 \
   --discovery-token-ca-cert-hash sha256:3a2c36feedd14cff3ae835abcdefgesadf235adca0369534e938ccb307ba5
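The token in the join command is short-lived (kubeadm bootstrap tokens expire after 24 hours by default), so a worker added later needs a fresh one. A minimal sketch, assuming it is run on the master node by a user with sudo rights; the function name is made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: print a complete, ready-to-paste "kubeadm join ..." command
# with a newly created bootstrap token. Run this on the master node.
fresh_join_command() {
  sudo kubeadm token create --print-join-command
}

# Usage on the master node:
#   fresh_join_command
# Then run the printed command with sudo on the new worker node.
```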

[Screenshot: kubeadm join output from both worker nodes]

Check the node status from the master node using the kubectl command:

kubectl get nodes

[Screenshot: node status before the CNI is installed]

As we can see, the node status is ‘NotReady’. To make the nodes active, we must install a CNI (Container Network Interface) network add-on plugin such as Calico, Flannel or Weave Net.

Step 7) Install Calico Pod Network Add-on

Run the following curl and kubectl commands from the master node to install the Calico network plugin:

curl https://projectcalico.docs.tigera.io/manifests/calico.yaml -O
kubectl apply -f calico.yaml

The output of the above commands would look like this:

[Screenshot: output of the Calico network add-on installation]

Verify the status of the pods in the kube-system namespace:

kubectl get pods -n kube-system

Output:

[Screenshot: kube-system pods after the Calico installation]

Perfect, check the node status as well.

kubectl get nodes

[Screenshot: node status after the Calico network add-on is installed]

Great, the above confirms that the nodes are active. Now, we can say that our Kubernetes cluster is functional.

Step 8) Test Kubernetes Installation

To test Kubernetes installation, let’s try to deploy nginx based application and try to access it.

kubectl create deployment nginx-app --image=nginx --replicas=2

Check the status of nginx-app deployment

kubectl get deployment nginx-app
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
nginx-app   2/2     2            2           68s

Expose the deployment as NodePort,

kubectl expose deployment nginx-app --type=NodePort --port=80
service/nginx-app exposed

Run following commands to view service status

kubectl get svc nginx-app
kubectl describe svc nginx-app

Output of the above commands:

[Screenshot: deployment and service status]

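Use the following command to access the nginx based application, replacing the port with the NodePort reported by kubectl get svc nginx-app (31246 in this example):

curl http://<worker-node-ip-address>:31246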
curl http://192.168.1.174:31246
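Because the NodePort is assigned by the cluster, it will usually differ from the 31246 shown here. One way to avoid copying it by hand is to ask kubectl for it. The helper below is a sketch with an invented name (nodeport_url); it assumes kubectl is already configured for the cluster and that the Service exposes a single port.

```shell
#!/usr/bin/env bash
# Sketch: look up the NodePort of a Service and print a URL for it.
nodeport_url() {
  local node_ip="$1" service="$2" port
  # jsonpath extracts the nodePort of the first port entry of the Service
  port="$(kubectl get svc "$service" -o jsonpath='{.spec.ports[0].nodePort}')" || return 1
  printf 'http://%s:%s\n' "$node_ip" "$port"
}

# Usage from the master node:
#   curl "$(nodeport_url 192.168.1.174 nginx-app)"
```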

Output:

[Screenshot: curl command accessing nginx on Kubernetes]

Great, the above output confirms that the nginx based application is accessible.

That’s all from this guide. I hope you have found it useful. Most of this post comes from https://www.linuxtechi.com/install-kubernetes-on-ubuntu-22-04/ with modifications to work with ZFS.

Resolving Oracle Cloud “Out of Capacity” issue and getting free VPS with 4 ARM cores / 24GB of memory (using OCI CLI)

A very neat and useful configuration was recently announced on the Oracle Cloud Infrastructure (OCI) blog as part of the Always Free tier. Unfortunately, as of July 2021, it’s still very complicated to launch an instance due to the “Out of Capacity” error. Oracle adds capacity from time to time, so here we work around the issue by retrying the launch until it succeeds.

Each tenancy gets the first 3,000 OCPU hours and 18,000 GB hours per month for free to create Ampere A1 Compute instances using the VM.Standard.A1.Flex shape (equivalent to 4 OCPUs and 24 GB of memory).

Start by installing the Oracle Cloud Infrastructure (OCI) CLI, following the Quickstart on docs.oracle.com. The installer script automatically installs the CLI and its dependencies, Python and virtualenv.

On a Mac you can also install the OCI CLI with Homebrew.

brew install oci-cli jq

Generating API key

After logging in to the OCI Console, click the profile icon and then “User Settings”.

Go to Resources -> API keys, click “Add API Key” button

Make sure “Generate API Key Pair” radio button is selected, click “Download Private Key” and then “Add”.

Copy the contents of the text area and save them to a file named “config”. I put it together with the *.pem file in a newly created directory, $HOME/.oci

That’s all about the API key generation part.
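Before moving on, it can be worth checking that the saved config file contains the entries the CLI expects. The sketch below is an illustration only: the function name check_oci_config is invented, and the sample file uses placeholder values instead of real identifiers.

```shell
#!/usr/bin/env bash
# Sketch: report which of the usual OCI config entries are missing from a file.
check_oci_config() {
  local file="$1" key missing=0
  for key in user fingerprint tenancy region key_file; do
    if ! grep -Eq "^${key}[[:space:]]*=" "$file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Demo with a placeholder config written to a temp file
SAMPLE_CONFIG="$(mktemp)"
cat > "$SAMPLE_CONFIG" <<'EOF'
[DEFAULT]
user=ocid1.user.oc1..example
fingerprint=00:11:22:33
tenancy=ocid1.tenancy.oc1..example
region=us-ashburn-1
key_file=~/.oci/example.pem
EOF

check_oci_config "$SAMPLE_CONFIG" && echo "config looks complete"
```

Against your own file, the call would be check_oci_config "$HOME/.oci/config".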

Setting up CLI

Specify config location

export OCI_CLI_CONFIG_FILE=$HOME/.oci/config

If you haven’t added OCI CLI binary to your PATH, run

alias oci="$HOME/bin/oci"

(or wherever it was installed).

Set permissions for the private key

oci setup repair-file-permissions --file $HOME/.oci/oracleidentitycloudservice***.pem

Test the authentication (user value should be taken from textarea when generating API key):

oci iam user get --user-id ocid1.user.oc1..aaaaaaaaa***123

Output should be similar to:

{
  "data": {
    "capabilities": {
      "can-use-api-keys": true,
      "can-use-auth-tokens": true,
      "can-use-console-password": false,
      "can-use-customer-secret-keys": true,
      "can-use-db-credentials": true,
      "can-use-o-auth2-client-credentials": true,
      "can-use-smtp-credentials": true
    },
    "compartment-id": "ocid1.tenancy.oc1..aaaaaaaa***123",
    "db-user-name": null,
    "defined-tags": {
      "Oracle-Tags": {
        "CreatedBy": "scim-service",
        "CreatedOn": "2021-08-31T21:03:23.374Z"
      }
    },
    "description": "<email address>",
    "email": null,
    "email-verified": true,
    "external-identifier": "123456789qwertyuiopas",
    "freeform-tags": {},
    "id": "ocid1.user.oc1..aaaaaaaaa***123",
    "identity-provider-id": "ocid1.saml2idp.oc1..aaaaaaaae***123",
    "inactive-status": null,
    "is-mfa-activated": false,
    "last-successful-login-time": null,
    "lifecycle-state": "ACTIVE",
    "name": "oracleidentitycloudservice/<email address>",
    "previous-successful-login-time": null,
    "time-created": "2021-08-31T21:03:23.403000+00:00"
  },
  "etag": "121345678abcdefghijklmnop"
}

Acquiring launch instance params

We need to know which Availability Domain is always free. Click Oracle Cloud menu -> Compute -> Instances

Click “Create Instance” and notice which one has “Always Free Eligible” label in Placement Section. In our case it’s AD-2.

Almost every command needs the compartment-id parameter to be set. Set the COMPARTMENT variable in the script below to the “tenancy” value from your config file, then save the script as ~/bin/launch-instance:

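#!/bin/bash -x
# Make a pipeline fail when any stage fails, so the $? checks below also catch oci errors
set -o pipefail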
SSH_PUB_KEY_FILE="$HOME/.ssh/id_rsa.pub"
SSH_KEY=$(cat $SSH_PUB_KEY_FILE)
OCI_CLI="/opt/homebrew/bin/oci"
JQ="/opt/homebrew/bin/jq"
FLAG_FILE="$HOME/.oci/success"
if [[ -f "$FLAG_FILE" ]]; then
  echo "Already deployed!"
  exit 0
fi
# ARM
SHAPE=VM.Standard.A1.Flex

COMPARTMENT=ocid1.tenancy.oc1..aaaaaaaa**123

# Setup the oci profile using the following command:
#   oci session authenticate --region us-ashburn-1
HC="TEST"
PROFILE=DEFAULT
DISPLAY_NAME="$USER-$( date -I )-$RANDOM"
mkdir -p $HOME/.oci/hosts/
INSTANCE_INFO_FILE="$HOME/.oci/hosts/$DISPLAY_NAME"
AUTH_PARAMS="--profile $PROFILE"
#AUTH_PARAMS="--profile $PROFILE --auth security_token"

AD=$($OCI_CLI iam availability-domain $AUTH_PARAMS list --compartment-id $COMPARTMENT | $JQ -r ".data| .[0].name")
if [[ $? != 0 ]]; then
   echo "Could not determine AD.  You might need to reauthenticate"
   echo "oci session authenticate --region us-ashburn-1 $AUTH_PARAMS"
   exit 1
fi

SUBNET=$($OCI_CLI network subnet $AUTH_PARAMS list --compartment-id $COMPARTMENT | $JQ -r ".data| .[0].id")
if [[ $? != 0 ]]; then
   echo "Could not determine Subnet"
   exit 1
fi

IMAGE=$($OCI_CLI compute image $AUTH_PARAMS list --compartment-id=$COMPARTMENT --shape=$SHAPE | $JQ -r '[ .data[] | select(."operating-system" == "Oracle Linux") | select(."operating-system-version"|startswith("8"))] | .[0].id')
if [[ $? != 0 ]]; then
   echo "Could not determine Image"
   exit 1
fi
# export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-bundle.crt
OCI_INFO=$($OCI_CLI $AUTH_PARAMS compute instance launch --shape $SHAPE \
   --availability-domain $AD \
   --compartment-id $COMPARTMENT \
   --image-id $IMAGE \
   --display-name $DISPLAY_NAME \
   --metadata "{ \"hostclass\": \"$HC\" }" \
   --subnet-id $SUBNET --shape-config "{ \"memoryInGBs\": 24.0, \"ocpus\": 4.0 }" \
   --ssh-authorized-keys-file $SSH_PUB_KEY_FILE \
)
if [[ $? != 0 ]]; then
   echo "Failed to deploy"
   exit 1
fi
echo "$OCI_INFO" > "$INSTANCE_INFO_FILE"
INSTANCE_ID=$($JQ -r '.data.id' < "$INSTANCE_INFO_FILE")
if [[ -z $INSTANCE_ID ]]; then
   echo "Failed to read instance info from the file"
   exit 1
fi
while [[ -z "$INSTANCE_IP" ]]; do
  echo "Waiting 10s for the IP to be available"
  sleep 10
  # "// empty" keeps INSTANCE_IP empty while no public IP is assigned, instead of the string "null"
  INSTANCE_IP=$($OCI_CLI $AUTH_PARAMS compute instance list-vnics --instance-id $INSTANCE_ID | $JQ -r '.data[]."public-ip" // empty')
done
done
if [[ ! -z $INSTANCE_IP ]]; then
echo "Updating the SSH config to include $DISPLAY_NAME"
cat >> ~/.ssh/config.d/custom < $FLAG_FILE
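The listing above breaks off in the middle of its last statement, so the end of the script is not recoverable from this page. The following is only a guess at what the missing tail does, based on the echo message, the opc login user mentioned later, and the flag file checked at the top of the script. The Host entry layout and the sample values are assumptions, and temp files stand in for ~/.ssh/config.d/custom and $HOME/.oci/success so the sketch can be run on its own.

```shell
#!/usr/bin/env bash
# Hypothetical completion of the script's tail (not the author's original code).
# Sample values stand in for what the script computed earlier.
DISPLAY_NAME="user-2023-01-30-22601"
INSTANCE_IP="203.0.113.10"
SSH_CONFIG_FILE="$(mktemp)"   # the script appends to ~/.ssh/config.d/custom
FLAG_FILE="$(mktemp -u)"      # the script uses $HOME/.oci/success

if [[ -n "$INSTANCE_IP" ]]; then
  echo "Updating the SSH config to include $DISPLAY_NAME"
  # Assumed entry format: one Host block per launched instance
  cat >> "$SSH_CONFIG_FILE" <<EOF
Host $DISPLAY_NAME
  HostName $INSTANCE_IP
  User opc
EOF
  # Record the success so later runs exit early at the FLAG_FILE check
  date > "$FLAG_FILE"
fi
```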

You can now set up crontab to run this script, for example every minute: save it as $HOME/bin/launch-instance, make it executable, and make sure the cron user can access the private key. Some of the variables in the script might need to be updated to match your system. We won’t cover this part.
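For orientation, a crontab entry for a once-a-minute run could be built like this. It is a sketch: the log file location is an assumption chosen for this example, and the install command is left as a comment so that nothing changes until you decide to run it.

```shell
#!/usr/bin/env bash
# Sketch: build a crontab line that runs the launch script every minute.
SCRIPT="$HOME/bin/launch-instance"
LOG="$HOME/.oci/launch-instance.log"   # hypothetical log location

# Five asterisks mean every minute of every hour, day and month
CRON_LINE="* * * * * $SCRIPT >> $LOG 2>&1"
echo "$CRON_LINE"

# To install it next to your existing entries:
#   (crontab -l 2>/dev/null; echo "$CRON_LINE") | crontab -
```

Redirecting both output streams to a log file makes it easier to see later which error each attempt ended with.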

Output:

{
  "data": {
    "agent-config": {
      "are-all-plugins-disabled": false,
      "is-management-disabled": false,
      "is-monitoring-disabled": false,
      "plugins-config": null
    },
    "availability-config": {
      "is-live-migration-preferred": null,
      "recovery-action": "RESTORE_INSTANCE"
    },
    "availability-domain": "RCFH:US-ASHBURN-AD-1",
    "capacity-reservation-id": null,
    "compartment-id": "ocid1.compartment.oc1..aaaaaaaa***123",
    "dedicated-vm-host-id": null,
    "defined-tags": {},
    "display-name": "user-2023-01-30-22601",
    "extended-metadata": {},
    "fault-domain": "FAULT-DOMAIN-3",
    "freeform-tags": {},
    "id": "ocid1.instance.oc1.iad.adsfasfasdfasdfasdfasdf12323dfsag234",
    "image-id": "ocid1.image.oc1.iad.aaaaaaaa**123",
    "instance-options": {
      "are-legacy-imds-endpoints-disabled": false
    },
    "ipxe-script": null,
    "launch-mode": "PARAVIRTUALIZED",
    "launch-options": {
      "boot-volume-type": "PARAVIRTUALIZED",
      "firmware": "UEFI_64",
      "is-consistent-volume-naming-enabled": true,
      "is-pv-encryption-in-transit-enabled": false,
      "network-type": "PARAVIRTUALIZED",
      "remote-data-volume-type": "PARAVIRTUALIZED"
    },
    "lifecycle-state": "PROVISIONING",
    "metadata": {
      "hostclass": "Your-Hostclass",
      "ssh_authorized_keys": "ssh-rsa AAAAB3123432412343241234324123432412343241234324123432412343241234324123432412343241234324123432412343241234324123432412343241234324123432412343241234324/71ctthb1Ek= your-ssh-key"
    },
    "platform-config": null,
    "preemptible-instance-config": null,
    "region": "iad",
    "shape": "VM.Standard.A1.Flex",
    "shape-config": {
      "baseline-ocpu-utilization": null,
      "gpu-description": null,
      "gpus": 0,
      "local-disk-description": null,
      "local-disks": 0,
      "local-disks-total-size-in-gbs": null,
      "max-vnic-attachments": 2,
      "memory-in-gbs": 6.0,
      "networking-bandwidth-in-gbps": 1.0,
      "ocpus": 1.0,
      "processor-description": "3.0 GHz Ampere® Altra™"
    },
    "source-details": {
      "boot-volume-size-in-gbs": null,
      "boot-volume-vpus-per-gb": null,
      "image-id": "ocid1.image.oc1.iad.aaaaaaaas***123",
      "kms-key-id": null,
      "source-type": "image"
    },
    "system-tags": {},
    "time-created": "2023-01-30T19:13:44.584000+00:00",
    "time-maintenance-reboot-due": null
  },
  "etag": "123456789123456789123456789123456789123456789",
  "opc-work-request-id": "ocid1.coreservicesworkrequest.oc1.iad.abcd***123"
}

I believe it’s pretty safe to leave the cron job running and check the cloud console once every few days, because once you succeed you usually won’t be able to create more instances than allowed. Instead, you will start getting something like

{
    "code": "LimitExceeded",
    "message": "The following service limits were exceeded: standard-a1-memory-count, standard-a1-core-count. Request a service limit increase from the service limits page in the console. "
}

or (again)

{
    "code": "InternalError",
    "message": "Out of host capacity."
}
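If you capture the CLI's error text, the two cases can be told apart with plain shell pattern matching. This is a sketch under the assumption that the error text has been captured into a variable; the function name and the labels it prints are invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: decide what a failed launch means from the error text.
classify_launch_error() {
  case "$1" in
    *'"code": "LimitExceeded"'*) echo "limit-reached" ;;  # an instance already uses the free quota
    *'Out of host capacity'*)    echo "retry" ;;          # no capacity yet, try again later
    *)                           echo "unknown" ;;
  esac
}

classify_launch_error '{ "code": "InternalError", "message": "Out of host capacity." }'
```

A cron wrapper could stop retrying on "limit-reached" and keep going on "retry".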

At least that’s how it worked for me. Just in case, the script writes a flag file when it successfully deploys an instance; if the file is in place, the script will not deploy again.

To verify the instance, you can run the following (with COMPARTMENT set to the same value as in the script):

oci compute instance list --compartment-id $COMPARTMENT

You could also add something to check its output periodically to know when cron needs to be disabled. That’s not related to our issue here.

Assigning public IP address

We are not doing this during the command run due to the default limitation (2 ephemeral addresses per compartment). Here is how to do it instead. Once you succeed in creating an instance, open the OCI Console and go to Instance Details -> Resources -> Attached VNICs, then select the VNIC’s name.

Then Resources -> IPv4 Addresses -> … -> Edit

Choose ephemeral and click “Update”

Conclusion

This is how you will log in once the instance is created (notice the default username, opc):

ssh -i ~/.ssh/id_rsa opc@<instance-public-ip>

If you didn’t assign a public IP, you can still copy the internal FQDN or private IP (10.x.x.x) from the instance details page and connect from another instance in the same VCN, e.g.

ssh -i ~/.ssh/id_rsa opc@<instance-private-ip>

Thanks for reading!

Copyright © 2018 tpmullan.com. All rights reserved