Setting up a High-Availability Kubernetes cluster using Talos Linux
I learned about Talos Linux, so I wanted to try it out in a "real" production scenario.
I also wanted a bit of a challenge, so I gave myself some additional constraints:
- Everything must be controlled by infrastructure-as-code.
- No using any "outside" services - no registered domain name, and no software-as-a-service that isn't deployed on a node I have root access to.
- All services should be deployed in a highly-available manner - with 9 nodes I should be able to tolerate a few failures!
Part 1: Setting up the Prerequisites
The first thing I saw an immediate need for was DNS resolution for the custom domain I'd be using. I chose the domain name testlab.kube for this.
Following the precedent I had set with my home lab, I chose the format <member>.<nodepool>.<cluster>.infra.testlab.kube for node DNS names.
I built a simple CoreDNS image and deployed it on a node in the Viviz Cluster.
With this, the problem of DNS resolution was solved.
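For reference, the Corefile for that image looks roughly like the following; the zone file path and upstream forwarder here are placeholders rather than my exact values, and the zone file itself just holds A records for the node names and other infra hosts.

```
# Corefile -- illustrative sketch; zone file path and upstream are placeholders
testlab.kube {
    file /etc/coredns/zones/testlab.kube.db
    log
    errors
}
. {
    forward . xx.xx.xx.xx
    cache 30
}
```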
After reading the Talos documentation, I determined that the best course of action was to build a custom ISO with the talos.config kernel flag set, pointing at a config file stored in MinIO. While this could be done using factory.talos.dev alone, that would break the constraint of not relying on SaaS outside my control. Thankfully, Sidero Labs also provides the source for the Image Factory, so it wasn't difficult to stand that up as well. I also needed to run my own version of discovery.talos.dev, which was likewise easy because of the provided source.
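For the curious, a self-hosted Image Factory accepts the same schematic format as the public factory.talos.dev; the kernel arguments get baked in via a schematic along these lines (the config URL is a placeholder here):

```
# schematic.yaml -- extra kernel args to bake into the custom ISO
customization:
  extraKernelArgs:
    - talos.config=<machine-config-url>
```

Registering the schematic and pulling the ISO is then a pair of requests against the factory (hostname, schematic ID, and Talos version are placeholders):

```
curl -X POST --data-binary @schematic.yaml https://<factory-host>/schematics
# => {"id":"<schematic-id>"}
curl -LO https://<factory-host>/image/<schematic-id>/<talos-version>/metal-amd64.iso
```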
Finally, I needed a way for talosctl commands to hit any control-plane node while working against a single endpoint. I created a new DNS entry for controlplane.testlab.kube and set up routing so that Traefik would send traffic on ports 50000 and 50001 to an instance of HAProxy. From there, HAProxy acts as a round-robin reverse proxy, forwarding requests to the control-plane nodes while allowing TLS passthrough so that talosctl works properly.
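On the Traefik side this is just two TCP entry points passed straight through. Assuming Traefik's file provider (an IngressRouteTCP would be the equivalent if it ran in-cluster), the dynamic config for the apid port looks roughly like this, with the HAProxy address as a placeholder - the 50001 router/service pair is identical apart from the port:

```
tcp:
  routers:
    talos-apid:
      entryPoints: ["talos-apid"]   # entry point bound to :50000 in the static config
      rule: "HostSNI(`*`)"
      service: talos-haproxy-apid
      tls:
        passthrough: true           # hand the TLS stream through untouched
  services:
    talos-haproxy-apid:
      loadBalancer:
        servers:
          - address: "<haproxy-host>:50000"
```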
An excerpt of the relevant HAProxy config is below; the resolvers block is necessary to resolve the custom DNS names.
resolvers talosnameserver
    nameserver ns1 xx.xx.xx.xx:53

backend talos_controlplane
    mode tcp
    balance roundrobin
    server kube-cp-01 karina.aespa.sm.infra.testlab.kube:50000 resolvers talosnameserver
    server kube-cp-02 seulgi.redvelvet.sm.infra.testlab.kube:50000 resolvers talosnameserver
    server kube-cp-03 irene.redvelvet.sm.infra.testlab.kube:50000 resolvers talosnameserver
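The frontends aren't shown in the excerpt, but there's nothing to them - the apid one is just a TCP bind handing everything to the backend above, and the trustd pair on 50001 looks the same with the other port:

```
frontend talos_apid
    mode tcp
    bind :50000
    default_backend talos_controlplane
```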
With this, I had everything set up to start installing Talos Linux on some nodes!
Part 2: Standing up the Cluster
With all the prerequisites set up, the next step was to install Talos Linux on all of the nodes that would form the cluster. I created two ISOs, one for control-plane nodes and one for worker nodes. The kernel command-line arguments for each were as follows:
controlplane:
ip=:::::::<dns0-ip>:<dns1-ip>: talos.config=http://console.minio.bootstrap.server/talos/controlplane.yml
worker:
ip=:::::::<dns0-ip>:<dns1-ip>: talos.config=http://console.minio.bootstrap.server/talos/worker.yml
The ip argument provides DNS information so that bootstrap.server can be resolved to return the initial configuration.
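The long run of colons is just the kernel's positional ip= syntax; the two DNS servers sit in the eighth and ninth fields, and everything before them is left empty:

```
# ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>
ip=:::::::<dns0-ip>:<dns1-ip>:
```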
After bootstrapping the first controlplane node, each node joined the cluster within 10 minutes.
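For reference, the bootstrap itself is a single talosctl call against one control-plane node; the commands below assume the generated talosconfig, with the target node left as a placeholder:

```
# point the client at the load-balanced endpoint, then bootstrap etcd on one node
talosctl --talosconfig ./talosconfig config endpoint controlplane.testlab.kube
talosctl --talosconfig ./talosconfig bootstrap --nodes <first-controlplane-node>
```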
I then ran talosctl patch mc, updating the corresponding hostname for each machine.
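Each patch is tiny - a strategic-merge snippet per machine along these lines (the target node below is a placeholder):

```
# hostname.patch.yaml -- one per node
machine:
  network:
    hostname: karina.aespa.sm.infra.testlab.kube
```

applied with talosctl patch mc --nodes <node-ip> --patch @hostname.patch.yaml.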
I added 9 nodes in total - 3 control-plane and 6 worker - and I was quite surprised that it worked out-of-the-box.
Part 3: Bootstrapping the Cluster (and a little bit more)
With the cluster created, I settled on using ArgoCD to deploy services into the cluster. I created a devcontainer configuration that does the following on startup:
- install kubectl and helm
- install helmfile
- install talosctl and use it to retrieve the kubeconfig for the cluster
- install miscellaneous utility tools (step-cli, grpcurl, etc.)
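The devcontainer definition itself is nothing fancy; a trimmed-down sketch, with the base image and install script name as placeholders:

```
// .devcontainer/devcontainer.json -- trimmed sketch
{
  "name": "cluster-bootstrap",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  // kubectl, helm, helmfile, talosctl, step-cli, grpcurl, etc. are pulled in
  // by a plain shell script when the container is created
  "postCreateCommand": "bash .devcontainer/install-tools.sh"
}
```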
With this, it was easy to then run a quick helmfile apply to get ArgoCD into the cluster.
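That helmfile is about as small as they get - a sketch of what it amounts to:

```
# helmfile.yaml -- minimal sketch for the initial ArgoCD install
repositories:
  - name: argo
    url: https://argoproj.github.io/argo-helm
releases:
  - name: argocd
    namespace: argocd
    createNamespace: true
    chart: argo/argo-cd
```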
Afterwards, I used kubectl apply to add a single resource, following the app-of-apps pattern to deploy all of the necessary services into the cluster. The YAML definitions for the apps are easily pulled from a pre-existing private Forgejo repository.
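That single resource is the root Application itself; its shape is roughly the following, with the Forgejo URL and repo path as placeholders:

```
# root.yaml -- the one resource applied by hand
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://<forgejo-host>/<org>/<repo>.git
    targetRevision: main
    path: apps
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```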
In order to bootstrap the cluster, I needed to provide both a way of provisioning volumes and a way to gather insights into the cluster. For this, I deployed metrics-server and local-path-provisioner; this part was especially straightforward.
After this, I was able to deploy cert-manager, step-ca, trust-manager, and Kyverno, allowing the certs for *.testlab.kube to be trusted in individual pods.
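The glue there is a trust-manager Bundle that publishes the step-ca root into a ConfigMap that pods can then mount; a sketch, with the source secret name being an assumption:

```
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: testlab-root-ca
spec:
  sources:
    - secret:
        name: step-ca-root   # wherever the step-ca root certificate lives
        key: ca.crt
  target:
    configMap:
      key: ca.crt            # written out as a ConfigMap for pods to consume
```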
The final step was adding MetalLB to assign an IP to the ingress gateway.
With an IP address assigned, the gateway accepted traffic and Istio routed it to the services as expected!
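MetalLB itself only needed an address pool and (assuming L2 mode) an advertisement; the range below is a placeholder:

```
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ingress-pool
  namespace: metallb-system
spec:
  addresses:
    - xx.xx.xx.240-xx.xx.xx.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ingress-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - ingress-pool
```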
With the bootstrapping complete, I then deployed Rook to the cluster alongside local-path-provisioner.
Part 4: Observability
The final piece in this exercise was providing insight into the cluster's health. I already had observability experience from my homelab, so I just replicated the process, with a few minor adjustments.
This time, instead of Promtail, I chose Grafana Alloy for sending logs to Loki. I also deployed Prometheus in high-availability mode with sharding, with Mimir as the long-term storage backend. Mimir's deduplication worked as expected, and I was able to view metrics on the Grafana dashboard!
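The deduplication works because each Prometheus replica tags its samples so that Mimir's HA tracker (enabled on the Mimir side) can elect a single writer; the relevant slice of the per-replica config looks something like this, with the Mimir endpoint as a placeholder:

```
global:
  external_labels:
    cluster: testlab          # shared by all replicas in the pair
    __replica__: prometheus-0 # unique per replica; Mimir drops the duplicate samples
remote_write:
  - url: http://<mimir-gateway>/api/v1/push
```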
Afterthoughts
I'm glad I decided to undertake this endeavor. I had set up Talos Linux before with only 3 nodes (1 control plane, 2 worker), and this approach was much more challenging. The additional constraints allowed me to better learn how Talos Linux works under the hood, as well as some of the nuances of certificate management.
I think there is still room for improvement if this were being used in a "true" production setting. The hostname patching and DNS editing are still very high-touch. If I were provisioning tens or hundreds of nodes, I would set up some sort of API that accepts metadata about each node as it is created (e.g. through Terraform or similar). I would then use a webserver to serve a host-specific config, using the IP of the request to retrieve the metadata and fill in a configuration template. If I ever need to set up a cluster with more than 20 or so nodes, I'll probably do that. I think this would also be useful for heterogeneous node sets, so that specific labels could be automatically applied based on the known specs of each node.
Additionally, retrieving config over HTTP is not ideal. If I were doing this in a "real" scenario, I'd provide the certificate information for my domain via the talos.config.inline parameter.
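Concretely, that would mean fetching the machine config over HTTPS and using talos.config.inline to supply a small document - something like a TrustedRootsConfig, encoded the way the parameter expects - so the custom CA is trusted during the fetch; a sketch, with the certificate contents elided:

```
# trusted-roots.yaml -- supplied (encoded) via talos.config.inline
apiVersion: v1alpha1
kind: TrustedRootsConfig
name: testlab-ca
certificates: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```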