
25 May 2020

Highly available Kubernetes with batteries for small business

Kindie (Kubernetes Individual) is an opinionated Kubernetes cluster setup for individuals or small businesses. Batteries are included so that you can hit the ground running and deploy production workloads in no time.

Target audience

Sysadmins, DevOps engineers, and cloud engineers with Linux and Kubernetes experience looking to build a Kubernetes cluster for production usage, with bells and whistles, focused on web workloads. You should be able to have the cluster ready in a few hours. If you don't understand some of the information here, please comment below or research it on the internet. This guide is not meant for complete beginners, but we try to keep it as accessible as possible without going into too much detail.



Feel free to change the setup as you wish, but then you're on your own. Even though we claim this is production ready for ourselves, it might not be for you, so adjust and test this setup further until you are satisfied. We deliberately use the root user instead of sudo to save time, and because we know what we are doing (most of the time).

Hardware specifications

Small business Kubernetes cluster

  • a router with uplink
  • Synology DS918+ with 16 GB memory and 4 TB of storage capacity
  • UPS for data safety
  • 2 NUCs with 100 GB disk storage and 16 GB memory each
  • access to manage a domain (DNS records)
  • Ubuntu Server 20.04 ISO downloaded onto a USB stick to install the NUCs


Here is a bird's-eye view of what you're about to build.



The core router serves the internal network. This is in line with the default networks in public cloud services like AWS VPCs. There's plenty of room to expand your cluster, and you will probably never use all the allocatable addresses here anyway. Within this range we have the following static addresses:

  • => Gateway address on the router
  • => Synology
  • => Floating IP assigned to the keepalived master. This address is highly available and is therefore used as the cluster endpoint of the Kubernetes API server and for HTTP(S) ingress into the cluster
  • => node1 (this is a Virtual Machine (VM) running in Synology)
  • => node2
  • => node3
  • => range reserved for internal load balancers (MetalLB)

There's also an optional UPS supporting the core of the system: the router and the Synology. The Synology also exposes an NFS share so that the nodes can use it as central storage.



The diagram above merely shows that there are 3 master nodes and N worker nodes, where N is greater than or equal to zero. Each node runs an ingress controller for HA. In this setup we untaint the master nodes so that regular workloads can be scheduled on them; therefore treat them like worker nodes.



The batteries included are split across 2 namespaces:

  • sys: internal misc services needed to support apps; sort of like shared infra services
  • monitoring: everything related to monitoring


  • Configure the router's internal network and create the port forward rules as described in the Network Architecture diagram.
  • Create a DNS record of type A for your cluster endpoint => YOUR_PUBLIC_IP
  • Create a wildcard DNS record of type A (or CNAME if you want): * => YOUR_PUBLIC_IP
  • Create another wildcard DNS record of type A (or CNAME if you want): * => YOUR_PUBLIC_IP
  • Set up your Synology and assign it the static address listed above
  • Configure the Synology to allow NFS mounts from the internal network

Synology NFS permission

  • Create a VM in Synology called node1 with 7 GB RAM and a 100 GB disk, and install ubuntu-server:
    • set a manual IP
    • set hostname to node1
    • install OpenSSH
    • create user ops
  • Install all your other physical/dedicated nodes as above (obviously use the corresponding static IPs and hostnames for node2, node3, etc…)

Kubernetes Cluster

At this point you have 3 nodes running: node1, node2 and node3. Because the first 3 nodes are master nodes, we will prepare them all with keepalived and kubeadm. Log in to each node over SSH using the ops username and password you set during installation. After you log in, switch to the root user with sudo su and enter your password again.


Keepalived

apt install -y keepalived

Create a file /etc/keepalived/keepalived.conf with the content:

vrrp_instance VI_1 {
    state MASTER
    interface ens3
    virtual_router_id 101
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass RANDOM_STRING_HERE
    }
    virtual_ipaddress {
        # the floating IP for the cluster endpoint goes here
    }
}

Replace RANDOM_STRING_HERE with a strong password of your choice (since this is an internal network, this is not a very big deal).
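If you don't want to invent a password by hand, a quick way to generate one (this assumes openssl is available, which it is on a stock Ubuntu server; note that keepalived only uses the first 8 characters of auth_pass):

```shell
# Print an 8-character random hex string to use as auth_pass
# (keepalived truncates auth_pass to 8 characters anyway)
openssl rand -hex 4
```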

It is, however, necessary to set the correct interface name. You can find yours with ip a.

After that we can wrap up with:

systemctl enable keepalived
systemctl start keepalived

We use the same keepalived.conf on all master nodes so that the active master is effectively chosen at random. Feel free to adjust the priority to influence the preference.
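For example, if you would rather have a deterministic preferred master, give each node its own priority in /etc/keepalived/keepalived.conf; the node with the highest priority wins the VRRP election. A sketch with hypothetical values:

```
# node1's keepalived.conf; node2 and node3 would use e.g. 100 and 50,
# keeping everything else identical
vrrp_instance VI_1 {
    state MASTER
    interface ens3
    virtual_router_id 101
    priority 150
    advert_int 1
}
```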


We will follow the official installation guide to install Kubernetes.


Containerd

cat > /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

# Setup required sysctl params, these persist across reboots.
cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

sysctl --system

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) \
    stable"
apt-get update && apt-get install -y containerd.io
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
systemctl restart containerd

Kubeadm, kubelet, kubectl

apt-get update && apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF | tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

NFS utils

Because we want to be able to mount NFS shares as PVCs.

apt install -y nfs-common


node1

To install our first master node on node1, we first stop keepalived on node2 and node3:

ssh ops@ 'systemctl stop keepalived'
ssh ops@ 'systemctl stop keepalived'

Now, on node1, confirm that it owns the floating IP:

ip a | grep ''

And confirm your DNS record is set correctly:

host has address

After that we are ready to continue:

kubeadm init --control-plane-endpoint "YOUR_ENDPOINT:6443" --apiserver-advertise-address=$(hostname -I | cut -d " " -f1) --upload-certs

Replace YOUR_ENDPOINT with the DNS name you created for the cluster endpoint.

After a while you will be greeted with a message similar to:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join --token XXXX.XXXX \
    --discovery-token-ca-cert-hash sha256:XXXX \
    --control-plane --certificate-key XXXX

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join --token XXXX.XXXX \
    --discovery-token-ca-cert-hash sha256:XXXX

node2 and node3

To install node2 and node3, log in to each node as ops, switch to root, then execute:

  kubeadm join --token XXXX.XXXX \
    --discovery-token-ca-cert-hash sha256:XXXX \
    --control-plane --certificate-key XXXX

(Obviously, replace the values)

Confirm nodes

On node1 as root execute:

export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get nodes

It should output something similar to:

node1   Ready    master   3d2h    v1.18.3
node2   Ready    master   3d2h    v1.18.3
node3   Ready    master   5h46m   v1.18.3

Let’s untaint the master nodes:

kubectl taint nodes --all node-role.kubernetes.io/master-

CNI (network)

If you do kubectl get pods -A you will see coredns is not starting up correctly:

root@node1:/home/ops# kubectl get pods -A
NAMESPACE     NAME                            READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467f8-2bqht        0/1     Pending   0          7m15s
kube-system   coredns-66bff467f8-l7pbt        0/1     Pending   0          7m15s

To fix that we need to install a CNI plugin; we choose Calico:

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

After a while coredns is running:

root@node1:/home/ops# kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-789f6df884-b8tsg   1/1     Running   0          4m58s
kube-system   calico-node-9fgqj                          1/1     Running   0          4m59s
kube-system   coredns-66bff467f8-2bqht                   1/1     Running   0          13m
kube-system   coredns-66bff467f8-l7pbt                   1/1     Running   0          13m

Smoke test

To smoke test, we can run a throwaway pod:

root@node1:/home/ops# kubectl run --rm=true -i --tty busybox --image=busybox --restart=Never -- ps
If you don't see a command prompt, try pressing enter.
Error attaching, falling back to logs: unable to upgrade connection: container busybox not found in pod busybox_default
   1 root      0:00 ps
pod "busybox" deleted

If you do not see the output of ps, something is broken.

Highly available test

So now we have 3 master nodes running in our cluster, and we can test the high availability of the API server. To do that, first bring keepalived back up on node2 and node3:

ssh ops@ 'systemctl start keepalived'
ssh ops@ 'systemctl start keepalived'

You will notice that node1 currently owns the floating IP. Let's copy the kubeconfig from node1 to your local machine (back up any existing ~/.kube/config first, since the command below appends to it):

ssh ops@ 'sudo cat /etc/kubernetes/admin.conf' >> ~/.kube/config

Now you should be able to execute kubectl commands from your local machine. Do for instance:

kubectl get nodes
node1   Ready    master   3d2h    v1.18.3
node2   Ready    master   3d2h    v1.18.3
node3   Ready    master   5h46m   v1.18.3

Now, if you reboot node1, the floating IP is automatically taken over by another node, so kubectl commands keep working while node1 is rebooting. As an exercise, find out which failover node now holds the floating IP.


Now that we have a Kubernetes cluster running with 3 masters and a highly available endpoint for the API server, we can continue setting up the services. From now on you can interact with the Kubernetes cluster from your local machine.

Namespace: sys

kubectl create namespace sys

MetalLB

helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm repo update

cat > metallb-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: sys
  name: metallb-config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      # add the address range you reserved for internal load balancers here
EOF
kubectl apply -f metallb-config.yaml
helm install metallb stable/metallb --namespace sys

See the metallb helm chart for full configuration options.
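To verify that MetalLB hands out addresses, you can create a throwaway Service of type LoadBalancer (the name, namespace, and selector below are just illustrative) and check that it gets an EXTERNAL-IP from your reserved range:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: metallb-test
  namespace: sys
spec:
  type: LoadBalancer
  selector:
    app: metallb-test
  ports:
  - port: 80
    targetPort: 80
```

Apply it with kubectl apply -f, inspect it with kubectl get svc -n sys metallb-test, then delete it again.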


Nginx ingress

cat > nginx-ingress-values.yaml <<EOF
controller:
  kind: DaemonSet
  daemonset:
    useHostPort: true
    hostPorts:
      http: 30080
      https: 30443
  service:
    enabled: false
  metrics:
    enabled: true
    service:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "10254"
defaultBackend:
  image:
    repository: cinaq/default-backend
    tag: 1.2
  replicaCount: 2
EOF
helm install nginx-ingress stable/nginx-ingress --namespace sys -f nginx-ingress-values.yaml

See the nginx-ingress helm chart for full configuration options.

Cert-manager (Letsencrypt)

helm repo add jetstack https://charts.jetstack.io
helm repo update
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.14.2/cert-manager.crds.yaml
helm install cert-manager jetstack/cert-manager --namespace sys --version v0.14.2

cat > issuer_letsencrypt.yaml <<EOF
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt
  namespace: sys
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: YOUR_EMAIL
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx
EOF
kubectl create -f issuer_letsencrypt.yaml

See the cert-manager helm chart for full configuration options.
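While testing, you may want a second ClusterIssuer pointed at the Let's Encrypt staging environment, which has far more generous rate limits than production (a sketch; same layout as above, only the name and ACME server URL differ):

```yaml
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Staging ACME endpoint: certificates are untrusted by browsers,
    # but you will not hit production rate limits while experimenting
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: YOUR_EMAIL
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
    - http01:
        ingress:
          class: nginx
```

Once certificates issue correctly against staging, switch your Ingress annotations back to the production issuer.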

NFS client provisioner

helm install nfs-storage stable/nfs-client-provisioner --namespace sys --set nfs.server=SYNOLOGY_IP --set nfs.path=/volume1/kubernetes
kubectl patch storageclass nfs-client -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

See the nfs-client-provisioner helm chart for full configuration options.
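As a quick check that dynamic provisioning works, you can create a small PVC against the nfs-client storage class (the name and namespace are illustrative) and confirm it becomes Bound:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-test
  namespace: sys
spec:
  storageClassName: nfs-client
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
```

Apply it, run kubectl get pvc -n sys nfs-test, and delete it once it shows STATUS Bound; a corresponding directory should appear on the Synology NFS share.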

Namespace: monitoring

kubectl create namespace monitoring


Prometheus

cat > prometheus-values.yaml <<EOF
alertmanager:
  replicaCount: 2
pushgateway:
  replicaCount: 2
server:
  replicaCount: 2
  statefulSet:
    enabled: true
EOF
helm install prometheus stable/prometheus -n monitoring -f prometheus-values.yaml

See the prometheus helm chart for full configuration options.


Loki

helm repo add loki https://grafana.github.io/loki/charts
helm repo update

helm install loki loki/loki-stack -n monitoring

See the loki-stack helm chart for full configuration options.


Grafana

cat > grafana-values.yaml <<EOF
persistence:
  enabled: true
replicas: 2
EOF
helm install grafana stable/grafana -n monitoring -f grafana-values.yaml

After the helm install, save the grafana admin password for later.

See the grafana helm chart for full configuration options.

Expose grafana via Ingress:

cat > grafana-resources.yaml <<EOF
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt"
    nginx.ingress.kubernetes.io/proxy-body-size: 1m
    nginx.ingress.kubernetes.io/configuration-snippet: |
      # IP white-listing
      allow YOUR_PUBLIC_IP;
      deny all;
spec:
  tls:
  - hosts:
    - YOUR_GRAFANA_HOST
    secretName: dev-grafana-sys-grafana-tls
  rules:
  - host: YOUR_GRAFANA_HOST
    http:
      paths:
      - path: /
        backend:
          serviceName: grafana
          servicePort: 80
EOF
kubectl apply -f grafana-resources.yaml

Now you should be able to visit grafana via its public URL. Notice that you get redirected automatically to HTTPS and that the certificate is signed by Let's Encrypt.

Log in here with username admin and the password you saved during the grafana installation.

After login, configure the 2 datasources:

  • loki: http://loki:3100
  • prometheus: http://prometheus-server
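Alternatively, the stable/grafana chart can provision these datasources declaratively instead of by hand; a sketch of the extra values (assuming the in-cluster service names above), which you would merge into grafana-values.yaml before helm install:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      url: http://prometheus-server
      access: proxy
      isDefault: true
    - name: Loki
      type: loki
      url: http://loki:3100
      access: proxy
```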

Then import the dashboards of your choice from grafana.com:


Your deployment is now complete. It should look something like this:

$ kubectl get pods -A -o wide
NAMESPACE     NAME                                                 READY   STATUS    RESTARTS   AGE     IP                NODE    NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-789f6df884-g2gm4             1/1     Running   2          3d2h   node1   <none>           <none>
kube-system   calico-node-8nv5r                                    1/1     Running   1          6h27m          node3   <none>           <none>
kube-system   calico-node-srxdd                                    1/1     Running   3          3d2h          node1   <none>           <none>
kube-system   calico-node-tslz8                                    1/1     Running   6          3d2h          node2   <none>           <none>
kube-system   coredns-66bff467f8-6dsk5                             1/1     Running   1          5h24m    node3   <none>           <none>
kube-system   coredns-66bff467f8-z2b9h                             1/1     Running   2          3d2h   node1   <none>           <none>
kube-system   etcd-node1                                           1/1     Running   3          3d2h          node1   <none>           <none>
kube-system   etcd-node2                                           1/1     Running   7          3d2h          node2   <none>           <none>
kube-system   etcd-node3                                           1/1     Running   1          6h27m          node3   <none>           <none>
kube-system   kube-apiserver-node1                                 1/1     Running   8          3d2h          node1   <none>           <none>
kube-system   kube-apiserver-node2                                 1/1     Running   9          3d2h          node2   <none>           <none>
kube-system   kube-apiserver-node3                                 1/1     Running   1          6h27m          node3   <none>           <none>
kube-system   kube-controller-manager-node1                        1/1     Running   35         3d2h          node1   <none>           <none>
kube-system   kube-controller-manager-node2                        1/1     Running   35         3d2h          node2   <none>           <none>
kube-system   kube-controller-manager-node3                        1/1     Running   4          6h27m          node3   <none>           <none>
kube-system   kube-proxy-cj42b                                     1/1     Running   5          3d2h          node2   <none>           <none>
kube-system   kube-proxy-nt7zn                                     1/1     Running   2          3d2h          node1   <none>           <none>
kube-system   kube-proxy-s8vgt                                     1/1     Running   1          6h27m          node3   <none>           <none>
kube-system   kube-scheduler-node1                                 1/1     Running   30         3d2h          node1   <none>           <none>
kube-system   kube-scheduler-node2                                 1/1     Running   33         3d2h          node2   <none>           <none>
kube-system   kube-scheduler-node3                                 1/1     Running   5          6h27m          node3   <none>           <none>
monitoring    grafana-74f7c48746-9dvxf                             1/1     Running   0          3h31m    node2   <none>           <none>
monitoring    grafana-74f7c48746-txwrv                             1/1     Running   0          3h30m    node3   <none>           <none>
monitoring    loki-0                                               1/1     Running   0          4h39m    node2   <none>           <none>
monitoring    loki-promtail-785qg                                  1/1     Running   4          3d1h     node2   <none>           <none>
monitoring    loki-promtail-8fnkw                                  1/1     Running   1          3d1h   node1   <none>           <none>
monitoring    loki-promtail-8vwpf                                  1/1     Running   1          6h27m    node3   <none>           <none>
monitoring    prometheus-alertmanager-6fcfd7bb84-mvm9k             2/2     Running   2          5h11m    node3   <none>           <none>
monitoring    prometheus-alertmanager-6fcfd7bb84-ndbhd             2/2     Running   0          3h27m    node2   <none>           <none>
monitoring    prometheus-kube-state-metrics-79f5b77cb8-4kh9x       1/1     Running   1          5h24m    node3   <none>           <none>
monitoring    prometheus-node-exporter-278sb                       1/1     Running   1          6h22m          node3   <none>           <none>
monitoring    prometheus-node-exporter-czrbw                       1/1     Running   4          3d          node2   <none>           <none>
monitoring    prometheus-node-exporter-xfw7s                       1/1     Running   1          3d          node1   <none>           <none>
monitoring    prometheus-pushgateway-5d85697467-88mp5              1/1     Running   0          3h27m    node2   <none>           <none>
monitoring    prometheus-pushgateway-5d85697467-hff9t              1/1     Running   1          5h24m    node3   <none>           <none>
monitoring    prometheus-server-0                                  2/2     Running   0          3h21m    node2   <none>           <none>
monitoring    prometheus-server-1                                  2/2     Running   0          3h20m    node3   <none>           <none>
sqirly        postgresql-545d95dcb9-npnbj                          1/1     Running   0          5h12m   node1   <none>           <none>
sqirly        sqirly-5d674b8d5b-ktbsw                              1/1     Running   1          6h16m    node3   <none>           <none>
sqirly        sqirly-5d674b8d5b-mnzzv                              1/1     Running   5          5h24m   node1   <none>           <none>
sys           cert-manager-678bc78d5d-gmb86                        1/1     Running   1          5h24m    node3   <none>           <none>
sys           cert-manager-cainjector-77bc84779-bq9xx              1/1     Running   4          5h24m    node3   <none>           <none>
sys           cert-manager-webhook-5b5485577f-5wz6c                1/1     Running   1          5h24m    node3   <none>           <none>
sys           distcc-deployment-5d6fb547d7-pjhd7                   1/1     Running   1          5h24m    node3   <none>           <none>
sys           metallb-controller-9f46bdfcb-zbtsw                   1/1     Running   1          6h12m    node3   <none>           <none>
sys           metallb-speaker-4bpqd                                1/1     Running   1          3d2h          node1   <none>           <none>
sys           metallb-speaker-t2jpt                                1/1     Running   4          3d2h          node2   <none>           <none>
sys           metallb-speaker-w4q2s                                1/1     Running   1          6h27m          node3   <none>           <none>
sys           minio-6df88b9995-x8qpt                               1/1     Running   1          5h12m    node3   <none>           <none>
sys           nfs-storage-nfs-client-provisioner-8fcb6b749-nskl4   1/1     Running   1          5h12m    node3   <none>           <none>
sys           nginx-ingress-controller-jxhtn                       1/1     Running   0          3h57m   node1   <none>           <none>
sys           nginx-ingress-controller-kk6sn                       1/1     Running   0          3h57m    node2   <none>           <none>
sys           nginx-ingress-controller-s4ndr                       1/1     Running   1          3h57m    node3   <none>           <none>
sys           nginx-ingress-default-backend-5c667c8479-hn769       1/1     Running   1          5h24m    node3   <none>           <none>
sys           nginx-ingress-default-backend-5c667c8479-zhnl8       1/1     Running   0          4h53m   node1   <none>           <none>

Grafana K8s Cluster summary

Grafana Nginx-Ingress Controller


Final thoughts

This setup is not truly highly available: the whole cluster depends on the Synology for data storage. You could improve this further by replacing the centralized NAS with a distributed storage solution. But besides that, the cluster is very solid and scalable. When rebooting any of the NUCs, your applications experience almost zero downtime. In case of a node outage, requests in flight on the broken node will be aborted, even if the broken node happens to be the active master; the floating IP will automatically fail over to another master node.
