Welp, almost three years after I bought an original Xserve for next to nothing, I finally managed to procure a hard disk caddy for it. After shipping (where I think I got hosed, $17.50 for a padded mailer that weighs all of about 500g tops?) it cost more than the Xserve did, but who cares, I had the fun budget to spare (spoiler: I forgot about buying Diablo 4 next month, so I really didn’t).
Anyway, it arrived Friday, so after work I threw a 40GB Seagate in it, fired up the OpenBSD 6.7 disc that was still in the drive, and set about installing it. Then… it wouldn’t boot. Fucked about a bit trying to get into OpenFirmware in order to “bless” the boot disk, and then it stopped booting completely, showing four blinks of the blue LEDs on the front, which typically indicates some sort of POST failure.
I tried removing the RAM, the GPU, and the extra NIC, reseating them all, no luck. Someone suggested (as lots of folks do, though I’ve not found any credible evidence it’s the case) that Apple lock these to only work with Apple hard disks, but removing the disk was the first thing I tried.
Next I tried removing the PRAM battery (mainly just to inspect that it was still good and hadn’t pissed everywhere). I tested it: 3.56V is close enough to the labelled 3.6V that I can conclude there’s nothing wrong with it.
On Saturday, I did a bit more research, plugged a null-modem serial cable in, configured `screen` to 57600 baud, and upon pressing enter a few times was met with what appeared to be a debugger:
FFF801FC 4BFFFFC4 >
No idea what those codes mean. I could step through it, or I could press `g` to “go”, where it would think for a bit, do another cycle of blue blinkies, then present the same prompt.
This got me thinking that maybe I either had a dead stick of RAM (I am not buying PC2100 DDR for fuck’s sake, this is a ten-dollar Mac!) or the CPU was unhappy, so I pulled the CPU card out, reseated it, and it worked! With the OpenBSD disc still in the CD-ROM, I got the boot console.
Now back to the original problem: getting it into OFW. I tried the apparent physical method several times (mirrored from here, but it looks like it was originally copy+pasted from the Xserve User Guide, page 56):
- With the power off, hold in the system identifier button while you press the on/standby button.
- Continue holding the system identifier button until the top row of blue lights blinks sequentially.
- Release the system identifier button. The rightmost light in the bottom row turns on.
- Press the system identifier button to light the next light in the bottom row, moving from right to left. Press the button again to change lights.
The lights in the bottom row indicate (from right to left):
- Light 1 (far right): Start up from a system disk in the optical drive (on a system with an optical drive). This also ejects any disc already in the optical drive.
- Light 2: Start up from a network server (NetBoot)
- Light 3: Start up from the internal drive (leftmost drive if more than one).
- Light 4: Bypass the current startup disk and startup from any other available startup disk.
- Light 5: Begin target disk mode (all drives, including optical drive, will show up)
- Light 6: Restore the system’s default settings (reset NVRAM)
- Light 7: Enter Open Firmware (via the serial port if no monitor and keyboard are connected)
When the light for the action you want is on, hold in the system identifier button for at least 2 seconds, until all the lights in the top row are on, then release the button.
This did not work for me, at all. I tried several times using the front and rear “identify” buttons. I made sure it wasn’t locked; I tried locking it and unlocking it. I tried it with the case on. It didn’t work. The only explanation I can come up with is that this menu is for the later slot-loading Xserve, and as mine is the earlier non-slot-loading model it doesn’t support it?
Anyway, I set about looking for another method, and someone mentioned using the typical Mac method of holding down Option+Cmd+O+F. Well, I don’t have a Mac keyboard, but I found a USB one (a Logitech, $20-ish from Kmart years ago), plugged it in, and I reasoned that the PC equivalents for Option and Command were Alt and the Windows key, so I gave that a go and whaddya know:
Apple RackMac1,1 4.4.4f1 BootROM built on 06/22/02 at 00:22:19
Copyright 1994-2002 Apple Computer, Inc.
All Rights Reserved.
Welcome to Open Firmware, the system time and date is: 00:53:44 01/01/1904
To continue booting, type "mac-boot" and press return.
To shut down, type "shut-down" and press return.
0 > setenv boot-device hd:,ofwboot /bsd ok
0 > reset-all
Side-note: fwoar this machine’s old enough to drink in the USA later this month!
Before I know it, I have it booting from the hard disk, right up until where I’m pretty sure it’s supposed to be starting `getty`, and then… nothing. God damn it, I bet the serial terminal is not enabled. That’s a problem, because the GPU is still not working, so I don’t have any way to see what’s on the screen without the serial console.
Reboot, stop it at the prompt, enter `boot -s`. Mount `/usr` (`/` was already mounted, but I needed to remount it read-write), `export TERM=vt100` so that `vi` would start, and I’m able to edit it to enable the terminal on the serial console.
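If memory serves, the file in question on OpenBSD is `/etc/ttys`. A minimal sketch of the change, working against a scratch copy rather than the real file (the getty speed should match whatever your serial console runs at):

```shell
# Sketch only: recreate the stock OpenBSD tty00 entry in a scratch file,
# then flip the getty from "off" to "on secure" and bump the speed to
# match the 57600 baud console. On the real box you'd edit /etc/ttys.
printf 'tty00\t"/usr/libexec/getty std.9600"\tunknown\toff\n' > ttys.sample
sed -i 's|std\.9600"\tunknown\toff|std.57600"\tvt100\ton secure|' ttys.sample
cat ttys.sample
```

The `secure` flag is what lets root log in on that terminal, which you probably want on a box with no working video output.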
One more reboot, and I’m in business. Now to figure out what to do with 2x 1GHz G4 PPC cores and 512MB of RAM? At the moment, turn it off and put it in the garage because it’s too fucking loud.
Immediately after lunch today (specifically at 1:13pm), I rang up Wades, who installed our heater, because it’s about time for our annual service. I think we only need to service it every two years, and we had them do it last year if memory serves, but it’s a good idea - in my humble opinion - to have them give it a once-over before we fire it up each winter.
But their answer surprised me: due to a lack of gas fitters in the area, they aren’t currently doing service calls at all… and they can’t recommend anyone either.
I asked, and it won’t affect our warranty - if it ceases working they’ll come out for a repair call, and the warranty will stay intact.
But jeeze, that’s a new one.
Last week, we had a furniture delivery, and part of it was too large to fit via the front door (something I wish to rectify when we eventually replace the front door, but alas that’s very far down our list of priorities), so it had to come in through the garage. But the VE was in the way to do that, and since the truck was already in the driveway, I figured I’d back it out through the rear door.
Long story short, for the first time in the nearly two years we’ve lived here, one of us (and it was me) managed to catch a mirror on the edge of the garage bricks… not a bad effort considering it’s quite a tight squeeze, but alas, it finally happened.
The mirror housing bounced and was fine, but the mirror itself cracked to shit, making it impossible to see out of. It also made the car potentially unroadworthy (I’m really not sure), and as we were heading out for Easter I did not want to risk it. I rang the local wreckers, and they suggested just having the local glass mob cut one to fit. Hey, that sounds great - they quoted $25, so I had them do it.
“It won’t be perfect”, the guy said. “Oh, and it’ll be flat rather than convex.” I guess I can live with that? I left the removed mirror there, came back at the end of the day, paid $25, and fitted it… I did not realize how much the lack of convexity would bother me. I feel like I can’t see shit, and between that and having my confidence dashed, backing into the garage is quite the ordeal now.
So today, fuck it, out to the wreckers. I waited for him to pull one - the first one he looked at was no good, the mirror flopping around due to a broken internal mount, and the one he did pull was the wrong color, but the painted part pops off so I could swap mine on (I’d been hoping to just pop the electric part out and not have to replace the entire housing). It was $85, which in retrospect I should have just paid in the first place to keep the car closer to original.
Replacing it was fairly straightforward… I used one of my plastic panel tools to pop the painted part off both pieces, and essentially pushed the housing off the three ball+cup mounts, being careful not to break any of them (as that’s what was broken on the defective one at the wreckers, I believe). Unplug the wires, transfer it across, pop it back in, pop the painted panel back on, and away we go.
It’s been a minute since I last had to travel for work, but that’s coming up again, which means for remote access purposes I need a VPN to my home network. I last accomplished this using the L2TP functionality built right into my Unifi Security Gateway, but since I got rid of that I simply never set it up again. The obvious solution is Wireguard, but why configure that by hand when I can use Tailscale to do it for me?
Why indeed, and since I’ve heard good things about their offering I decided to take a look at it. First step, let’s run it in an LXD container, so I can blow it away if I need to without polluting the rest of my network. That was fairly painless:
fwaggle@ghast:~$ lxc launch ubuntu-minimal:focal tailscale
fwaggle@ghast:~$ lxc shell tailscale
root@tailscale:~# apt update
-- SNIP --
18 packages can be upgraded. Run 'apt list --upgradable' to see them.
root@tailscale:~# apt upgrade -y
-- SNIP --
root@tailscale:~# apt autoremove
-- SNIP --
root@tailscale:~# curl -fsSL https://tailscale.com/install.sh | sh
Installing Tailscale for ubuntu focal, using method apt
-- SNIP --
Installation complete! Log in to start using Tailscale by running:
root@tailscale:~# tailscale up
To authenticate, visit:
-- SNIP --
For a second device, I installed it on my phone, turned off Wifi, and whaddya know, I can ping something.
Restarting tailscale with `tailscale up --advertise-exit-node` and flipping on the exit node on my phone meant I got my home IP despite not being on the home wifi, so I’m counting that as a success.
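For completeness, the exit-node recipe on a Linux host looks roughly like this, from memory (not verified against the current Tailscale docs) - the sysctl step is needed so the container will actually forward traffic for other devices, and the node still has to be approved as an exit node in the admin console afterwards:

```shell
# Rough sketch: enable IP forwarding so the node can pass traffic
# on behalf of other devices on the tailnet...
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# ...then re-run "up" with the flag, and approve the node as an
# exit node in the Tailscale admin console.
sudo tailscale up --advertise-exit-node
```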
Less successful was accessing my other services, and it took a bit to figure out why. My first port of call was checking connectivity at the container, which I did using ICMP. This didn’t work:
From _gateway.lxd (10.13.0.1): icmp_seq=1 Redirect Host(New nexthop: _gateway (10.255.0.1))
From _gateway (10.255.0.1) icmp_seq=1 Time to live exceeded
This is actually not unexpected, I suppose. Since this machine is really three machines in a trench coat (NFS server, LXD server, and a K8s node) and runs two BGP peers on two different IPs, I figured it was something fucky with the routing: the ping was going up to my router, which tried to send it back down the same path, and then the TTL expired.
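A quick way to sanity-check that sort of thing, for what it’s worth, is to ask the kernel which next hop it would pick for a destination (the address here is my router’s, from the output above; substitute your own):

```shell
# Show the chosen route/next-hop for a destination without sending any
# traffic; the "|| true" just keeps this from erroring on a box that
# has no route to the address at all.
ip route get 10.255.0.1 || true
```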
But a service hosted on LXD should work, because LXD’s bridge network should route it sensibly. It did not, from my phone. More puzzling, actually, a `curl` of a service hosted on Kubernetes worked from the container.
I soon realized I needed to convince Tailscale to advertise these routes - it seems it won’t expose RFC1918 addresses even if it’s declared as an exit node, which is a very sensible default, really. So I configured it to have access to my entire LAN via the `--advertise-routes=` parameter to `tailscale up`, approved it in the control panel, and it worked. I later retracted that, opting to only allow access to the services I want.
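Roughly what that looked like, reconstructed from memory - the subnets are the ones that appear in my network above, but the masks are a guess, so substitute your own:

```shell
# Sketch: advertise the LAN subnets alongside the exit node, then
# approve the routes in the Tailscale admin console. The subnet
# masks here are assumptions, not my actual config.
sudo tailscale up --advertise-exit-node \
    --advertise-routes=10.13.0.0/24,10.255.0.0/24
```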
The final piece of the puzzle (after being greeted with a 403) was to allow the IP of the tailscale container in my `lan-only` middleware in Traefik, and I’m away.
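For reference, a minimal sketch of what a `lan-only` middleware like mine might look like in Traefik’s dynamic configuration (v2 syntax, file provider; the ranges are examples, not my actual config):

```yaml
http:
  middlewares:
    lan-only:
      ipWhiteList:
        sourceRange:
          - "192.168.1.0/24"  # example LAN range
          - "10.13.0.0/24"    # example LXD bridge, covering the tailscale container
```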
I’m as yet undecided on whether I want to allow access to the K8s control plane via this or not, but I can do most everything else, including accessing Home Assistant.
Woke up this morning to an email from Let’s Encrypt that the certificate for our router is expiring soon - weird, because it should renew automatically. So I logged into the router to take a look, and `acme.sh` is failing because `socat` is throwing a segmentation fault.
No drama, I’ve seen this before (last time it was `opkg`, before that it was `tcpdump`, and I had trouble setting up BGP due to `bird` doing it until I switched to the IPv4-only version)… typically a reboot solves it, and since Duncan was still asleep and Sabriena was reading, it should have been over nice and quick. But after a few minutes it didn’t come back, and I started to worry.
Sure enough, upon plugging a serial cable in and checking the logs, I’m greeted with:
[ 14.310839] Run /sbin/init as init process
[ 14.468395] SQUASHFS error: xz decompression failed, data probably corrupt
[ 14.475296] SQUASHFS error: squashfs_read_data failed to read block 0x94d8a
[ 14.498128] SQUASHFS error: xz decompression failed, data probably corrupt
[ 14.505020] SQUASHFS error: squashfs_read_data failed to read block 0x94d8a
[ 14.527841] SQUASHFS error: xz decompression failed, data probably corrupt
[ 14.534735] SQUASHFS error: squashfs_read_data failed to read block 0x94d8a
[ 14.541741] Starting init: /sbin/init exists but couldn't execute it (error -14)
[ 14.549156] Run /etc/init as init process
[ 14.554708] Run /bin/init as init process
[ 14.558790] Run /bin/sh as init process
[ 14.562770] Starting init: /bin/sh exists but couldn't execute it (error -14)
[ 14.569930] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux D.
[ 14.584125] Rebooting in 1 seconds..
I tried a few things to kick it loose, up to and including a clean flash of OpenWRT, to no avail. Interestingly though, I tried the stock EdgeMax firmware and that functions fine, and since at this point we were close to two hours without internet and I have work tomorrow, I figured this would have to do.
Setting up everything else took the better part of the afternoon and it was basically dinner time before I had everything working correctly. We still don’t have full internal DNS which is annoying, but we can turn on and off the lamps correctly. What a way to spend my Sunday!
But so now I have to work out what went wrong and what to do about it. I’m suspecting that it’s probably bad flash, though I don’t have a good explanation for why it’s not failing with the stock firmware… possibly it’s due to layout differences or something. The EdgeMax firmware is serviceable, but not something I think I’d want to keep long-term.
Do I go back to drinking the Unifi kool-aid? It seems like that’s probably the simplest solution, though it does leave a bad taste in my mouth. I could get a UDM-SE, do away with a bunch of equipment in the rack, and have an integrated controller plus a 10gig backhaul to the server rack as well. But do I really wanna go down that road? Not the least of my concerns is that it’s a fuckload of money (roughly $1050AUD) if I don’t really like it.