We’ve had a home server for basically as long as I can remember, with very few breaks along the way. In Sacramento I had a series of servers, generally OpenBSD or FreeBSD, which we re-created when everything died during the move to Indiana. But even before that, in Australia, I had a pile of machines configured in a really terrible network (I have successfully repressed most of the horrors of thin coax networking), running a mix of SuSE and Slackware Linux.
Back to Australia and we set up a FreeBSD machine here too, which we kept going when we moved out to Horsham, and the rest is history. Shortly after starting work at my current employer, I shut down the FreeBSD machine and moved everything over to Linux in order to get more familiar with LXD. This worked a treat: I managed to learn a lot very quickly by breaking things, fixing them, and just generally figuring out how to do things I probably shouldn’t with it.
Some time in the last year or two I switched to Docker, mainly out of curiosity. I have no plans to leave where I currently am, but if I ever did, the combination of FreeBSD and LXD knowledge isn’t super-marketable, and Docker was the new hotness. Plus, with Docker most of the hard work is done for you, so I figured I could simplify some of my home sysadmin duties to boot. It worked rather well: with Docker Compose I could bring everything up again very quickly if I had to.
But again, the new hotness is Kubernetes, and I know nothing about it. I’ve wanted to look at it for a few years now, but it’s always been super intimidating and I didn’t have a good excuse for it… but last week I decided to have a crack at it.
So I wouldn’t have to take down any of my Docker stuff, my original plan was to run it in some LXD containers, but I ran into some networking grief that I thought might be related to that: pod networking worked between pods on the same node, but failed across nodes. So I eventually pieced together the remains of the aforementioned FreeBSD machine (an Athlon II, with a whopping 8GB RAM), installed Ubuntu on it and made it a Kubernetes control plane node (avoiding the slavery-adjacent nomenclature where I can).
I then used kubeadm to join a node on the main server to it, started setting everything up and… same issue! It wasn’t immediately apparent that it was, though: I was getting a timeout from Traefik trying to reach the kube-api service, but it turned out to be the same issue as above.
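For reference, the bring-up boiled down to something like this sketch. The IP, token and hash are placeholders, and the exact output of kubeadm varies by version:

```shell
# On the control plane node (the old Athlon II box):
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# kubeadm init prints a join command with a fresh token; on the worker
# node (the main server) it looks something like:
sudo kubeadm join 192.168.1.10:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>

# If you lose the join command, you can regenerate it on the control
# plane node later:
kubeadm token create --print-join-command
```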
I spent all week on this in my spare time, and my brain was essentially mush by this point, but on Saturday morning I finally got it working. The problem? It doesn’t seem to be documented, but I found it in a throwaway comment I very nearly skipped over: if you’re using flannel for “simple networking”, it seems it absolutely cannot tolerate anything but a very specific CIDR range for the pod network. I had changed this because I wanted it to butt up against my real networks so I wouldn’t forget it was there, and that’s what caused the problems. Copy+pasting the range from the guides I was following made it all “just work”.
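The specific range, as far as I can tell, lives in flannel’s own manifest: the stock kube-flannel.yml ships a net-conf.json ConfigMap that hard-codes the pod network, and it has to agree with whatever you passed to kubeadm. A rough way to check (the manifest path is a placeholder for wherever you downloaded it):

```shell
# Show the pod network flannel's manifest assumes; the stock one
# hard-codes "Network": "10.244.0.0/16" in its net-conf.json ConfigMap.
grep -A4 'net-conf.json' kube-flannel.yml

# Either pass exactly that CIDR to `kubeadm init --pod-network-cidr=...`,
# or edit the manifest to match your range *before* applying it:
kubectl apply -f kube-flannel.yml
```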
But by this point I had destroyed all my Docker containers, stopped LXD, rebooted the machine and done all sorts of stuff, generally making a huge mess of our “home prod”, so I spent most of Saturday figuring out how to stand up the services we use, and now most of them are happily running on Kubernetes. It’s quite fragile, in that if that old Athlon II machine falls over the “cluster” will stop, but I’ll solve that later; I just haven’t worked out how I want to do it yet. I’ve looked at Raspberry Pis, Rock64s and Intel NUCs, and I’m still not sure what I want.
I’m thinking that later on I’ll make the current server a dumb NFS server and use NFS to back the containers, then set up a couple of “compute nodes” and maybe some SBCs for a control plane; I’m not sure what I want to do yet. However, it’s become very clear that quite a bit of our energy consumption could be slashed if I replaced the old dual-Xeon machine, so I’ll have to do something sooner rather than later. Naturally, there are basically zero cheap rack-mount cases around when I want them.
I did have quite some grief getting the Unifi stuff back up and running: it seems that if the controller changes IP, the devices will hate life. Solving it was fairly simple, though: SSH into each device, run `set-inform http://controller-ip:8080/inform`, exit out, and reboot the device (it will show on the controller almost immediately, and the reboot confirms it comes back up properly). I didn’t actually reboot them, though; I instead used the opportunity to update the firmware on each, since by this point everyone else was asleep.
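Re-pointing everything looked roughly like this; the device IPs, SSH user and controller address are all placeholders for whatever your network uses:

```shell
# Point each UniFi device at the controller's new IP. The SSH user and
# password are whatever your controller configured on adoption.
for ip in 192.168.1.20 192.168.1.21 192.168.1.22; do
  ssh admin@"$ip" 'set-inform http://192.168.1.5:8080/inform'
done

# Each device should appear on the controller almost immediately.
# Reboot (or firmware-update, which reboots anyway) each one afterwards
# to confirm it comes back up pointing at the right place.
```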