Setting up a High-Availability Kubernetes cluster using Talos Linux
I learned about Talos Linux, so I wanted to try it out in a "real" production scenario.
I also wanted a bit of a challenge, so I gave myself some additional constraints:
- Everything must be controlled by infrastructure-as-code.
- No using any "outside" services - no registered domain name, and no software-as-a-service that isn't deployed on a node I have root access to.
- All services should be deployed in a highly-available manner - with 9 nodes I should be able to tolerate a few failures!
Part 1: Setting up the Prerequisites
The first thing I saw an immediate need for was DNS resolution for the custom domain I'd be using. I chose the domain name testlab.kube for this.
Following the precedent I had set with my home lab, I chose the format <member>.<nodepool>.<cluster>.infra.testlab.kube for node DNS names.
I built a simple CoreDNS image and deployed it on a node in the Viviz Cluster.
With this, the problem of DNS resolution was solved.
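For reference, the Corefile for that image looks roughly like the following; the zone file path and upstream forwarder here are placeholders rather than my exact values, and the zone file itself just holds A records for the node names and other infra hosts.

```
# Corefile -- illustrative sketch; zone file path and upstream are placeholders
testlab.kube {
    file /etc/coredns/zones/testlab.kube.db
    log
    errors
}
. {
    forward . xx.xx.xx.xx
    cache 30
}
```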
After reading the Talos documentation, I determined that the best course of action was to build a custom ISO with the talos.config kernel flag set, pointing at a config file stored in MinIO. While this could be done using factory.talos.dev alone, that would break the constraint of not relying on SaaS outside my control. Thankfully, Sidero Labs also provides the source for the Image Factory, so it wasn't difficult to stand that up as well. I also needed to run my own version of discovery.talos.dev, which was likewise easy because of the provided source.
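For the curious, a self-hosted Image Factory accepts the same schematic format as the public factory.talos.dev; the kernel arguments get baked in via a schematic along these lines (the config URL is a placeholder here):

```
# schematic.yaml -- extra kernel args to bake into the custom ISO
customization:
  extraKernelArgs:
    - talos.config=<machine-config-url>
```

Registering the schematic and pulling the ISO is then a pair of requests against the factory (hostname, schematic ID, and Talos version are placeholders):

```
curl -X POST --data-binary @schematic.yaml https://<factory-host>/schematics
# => {"id":"<schematic-id>"}
curl -LO https://<factory-host>/image/<schematic-id>/<talos-version>/metal-amd64.iso
```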
Finally, I needed a way for talosctl commands to hit any control-plane node while working against a single endpoint. I created a new DNS entry for controlplane.testlab.kube and set up routing so that Traefik would send traffic on ports 50000 and 50001 to an instance of HAProxy. From there, HAProxy acts as a round-robin reverse proxy, forwarding requests to the control-plane nodes while allowing TLS passthrough so that talosctl works properly.
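On the Traefik side this is just two TCP entry points passed straight through. Assuming Traefik's file provider (an IngressRouteTCP would be the equivalent if it ran in-cluster), the dynamic config for the apid port looks roughly like this, with the HAProxy address as a placeholder - the 50001 router/service pair is identical apart from the port:

```
tcp:
  routers:
    talos-apid:
      entryPoints: ["talos-apid"]   # entry point bound to :50000 in the static config
      rule: "HostSNI(`*`)"
      service: talos-haproxy-apid
      tls:
        passthrough: true           # hand the TLS stream through untouched
  services:
    talos-haproxy-apid:
      loadBalancer:
        servers:
          - address: "<haproxy-host>:50000"
```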
An excerpt of the relevant HAProxy config is below; the resolvers block is necessary to resolve the custom DNS names.
resolvers talosnameserver
    nameserver ns1 xx.xx.xx.xx:53

backend talos_controlplane
    mode tcp
    balance roundrobin
    server kube-cp-01 karina.aespa.sm.infra.testlab.kube:50000 resolvers talosnameserver
    server kube-cp-02 seulgi.redvelvet.sm.infra.testlab.kube:50000 resolvers talosnameserver
    server kube-cp-03 irene.redvelvet.sm.infra.testlab.kube:50000 resolvers talosnameserver
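The frontends aren't shown in the excerpt, but there's nothing to them - the apid one is just a TCP bind handing everything to the backend above, and the trustd pair on 50001 looks the same with the other port:

```
frontend talos_apid
    mode tcp
    bind :50000
    default_backend talos_controlplane
```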
With this, I had everything set up to start installing Talos Linux on some nodes!
Part 2: Standing up the Cluster
With all the prerequisites set up, the next step was to install Talos Linux on all of the nodes that would form the cluster. I created two ISOs, one for control-plane nodes and one for worker nodes. The kernel command-line arguments for each were as follows:
controlplane:
ip=:::::::<dns0-ip>:<dns1-ip>: talos.config=http://console.minio.bootstrap.server/talos/controlplane.yml
worker:
ip=:::::::<dns0-ip>:<dns1-ip>: talos.config=http://console.minio.bootstrap.server/talos/worker.yml
The ip argument provides DNS information so that bootstrap.server can be resolved to return the initial configuration.
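The long run of colons is just the kernel's positional ip= syntax; the two DNS servers sit in the eighth and ninth fields, and everything before them is left empty:

```
# ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>
ip=:::::::<dns0-ip>:<dns1-ip>:
```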
After bootstrapping the first controlplane node, each node joined the cluster within 10 minutes.
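For reference, the bootstrap itself is a single talosctl call against one control-plane node; the commands below assume the generated talosconfig, with the target node left as a placeholder:

```
# point the client at the load-balanced endpoint, then bootstrap etcd on one node
talosctl --talosconfig ./talosconfig config endpoint controlplane.testlab.kube
talosctl --talosconfig ./talosconfig bootstrap --nodes <first-controlplane-node>
```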
I then ran talosctl patch mc, updating the corresponding hostname for each machine.
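Each patch is tiny - a strategic-merge snippet per machine along these lines (the target node below is a placeholder):

```
# hostname.patch.yaml -- one per node
machine:
  network:
    hostname: karina.aespa.sm.infra.testlab.kube
```

applied with talosctl patch mc --nodes <node-ip> --patch @hostname.patch.yaml.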
I added 9 nodes in total - 3 control-plane and 6 worker - and I was quite surprised that it worked out-of-the-box.
Part 3: Bootstrapping the Cluster (and a little bit more)
With the cluster created, I settled on using ArgoCD to deploy services into the cluster. I created a devcontainer configuration that does the following on startup:
- install kubectl and helm
- install helmfile
- install talosctl and use it to retrieve the kubeconfig for the cluster
- install miscellaneous utility tools (step-cli, grpcurl, etc.)
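The devcontainer definition itself is nothing fancy; a trimmed-down sketch, with the base image and install script name as placeholders:

```
// .devcontainer/devcontainer.json -- trimmed sketch
{
  "name": "cluster-bootstrap",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  // kubectl, helm, helmfile, talosctl, step-cli, grpcurl, etc. are pulled in
  // by a plain shell script when the container is created
  "postCreateCommand": "bash .devcontainer/install-tools.sh"
}
```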
With this, it was easy to then run a quick helmfile apply to get ArgoCD into the cluster.
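That helmfile is about as small as they get - a sketch of what it amounts to:

```
# helmfile.yaml -- minimal sketch for the initial ArgoCD install
repositories:
  - name: argo
    url: https://argoproj.github.io/argo-helm
releases:
  - name: argocd
    namespace: argocd
    createNamespace: true
    chart: argo/argo-cd
```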
Afterwards, I used kubectl apply to add a single resource, following the app-of-apps pattern to deploy all of the necessary services into the cluster. The YAML definitions for the apps are easily pulled from a pre-existing private Forgejo repository.
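That single resource is the root Application itself; its shape is roughly the following, with the Forgejo URL and repo path as placeholders:

```
# root.yaml -- the one resource applied by hand
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://<forgejo-host>/<org>/<repo>.git
    targetRevision: main
    path: apps
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```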
In order to bootstrap the cluster, I needed to provide both a way of provisioning volumes and a way to gather insights into the cluster. For this, I deployed metrics-server and local-path-provisioner; this part was especially straightforward.
After this, I was able to deploy cert-manager, step-ca, trust-manager, and Kyverno, allowing the certs for *.testlab.kube to be trusted in individual pods.
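The glue there is a trust-manager Bundle that publishes the step-ca root into a ConfigMap that pods can then mount; a sketch, with the source secret name being an assumption:

```
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: testlab-root-ca
spec:
  sources:
    - secret:
        name: step-ca-root   # wherever the step-ca root certificate lives
        key: ca.crt
  target:
    configMap:
      key: ca.crt            # written out as a ConfigMap for pods to consume
```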
The final step was adding MetalLB to assign an IP to the ingress gateway.
With an IP address assigned, the gateway accepted traffic and Istio routed it to the services as expected!
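MetalLB itself only needed an address pool and (assuming L2 mode) an advertisement; the range below is a placeholder:

```
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: ingress-pool
  namespace: metallb-system
spec:
  addresses:
    - xx.xx.xx.240-xx.xx.xx.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: ingress-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - ingress-pool
```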
With the bootstrapping complete, I then deployed Rook to the cluster alongside local-path-provisioner.
Part 4: Observability
The final piece in this exercise was providing insight into the cluster's health. I already had observability experience from my homelab, so I just replicated the process, with a few minor adjustments.
This time, instead of Promtail, I chose Grafana Alloy for sending logs to Loki. I also deployed Prometheus in high-availability mode with sharding, with Mimir as the long-term storage backend. Mimir's deduplication worked as expected, and I was able to view metrics on the Grafana dashboard!
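The deduplication works because each Prometheus replica tags its samples so that Mimir's HA tracker (enabled on the Mimir side) can elect a single writer; the relevant slice of the per-replica config looks something like this, with the Mimir endpoint as a placeholder:

```
global:
  external_labels:
    cluster: testlab          # shared by all replicas in the pair
    __replica__: prometheus-0 # unique per replica; Mimir drops the duplicate samples
remote_write:
  - url: http://<mimir-gateway>/api/v1/push
```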
Afterthoughts
I'm glad I decided to undertake this endeavor. I had set up Talos Linux before with only 3 nodes (1 control plane, 2 worker), and this approach was much more challenging. The additional constraints allowed me to better learn how Talos Linux works under the hood, as well as some of the nuances of certificate management.
I think there is still room for improvement if this were being used in a "true" production setting. The hostname patching and DNS editing are still very high-touch. If I were provisioning tens or hundreds of nodes, I would set up some sort of API that accepts metadata about each node as it is created (e.g. through Terraform or similar). I would then use a webserver to serve a host-specific config, using the IP of the request to retrieve the metadata and fill in a configuration template. If I ever need to set up a cluster with more than 20 or so nodes, I'll probably do that. I think this would also be useful for heterogeneous node sets, so that specific labels could be automatically applied based on the known specs of each node.
Additionally, retrieving config over HTTP is not ideal. If I were doing this in a "real" scenario, I'd provide the certificate information for my domain via the talos.config.inline parameter.
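Concretely, that would mean fetching the machine config over HTTPS and using talos.config.inline to supply a small document - something like a TrustedRootsConfig, encoded the way the parameter expects - so the custom CA is trusted during the fetch; a sketch, with the certificate contents elided:

```
# trusted-roots.yaml -- supplied (encoded) via talos.config.inline
apiVersion: v1alpha1
kind: TrustedRootsConfig
name: testlab-ca
certificates: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```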