Weeknotes 247
22nd March, 2026
“ZigBee woes”
-
As you might remember, I’ve been having several ZigBee-related woes for a while, and this week was no different. This time the investigation was prompted by Zigbee2MQTT spontaneously restarting, or trying to. The container seemed to die and then try to restart but couldn’t, and that would continue for a bit.
I suspect this has been happening for a while, but I never noticed because I didn’t have any sort of alerting set up. Now, through Dockhand, I get notifications when Docker container events happen. (In fact, I get too many notifications; my watch goes berserk during these outages!)
From the logs, it seemed the Zigbee2MQTT container was dying because it couldn’t reach the SLZB-06M over the network. (I’m not sure why it has to crash when this happens.) But this was sporadic: it would be fine for a long time, and then suddenly start misbehaving.
Eventually, I noticed the container would die during large file downloads. I could reliably reproduce the problem by triggering a large download; after a couple of minutes the container would consistently die. And when the download stopped, the container would recover. So large amounts of network traffic were causing some sort of instability or congestion, to the point that Zigbee2MQTT and the SLZB-06M could not reach each other over the network.
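For anyone wanting to watch this sort of failure happen, a tiny TCP probe run alongside a big download makes the dropouts visible. This is only a rough sketch, not what I actually ran: the function names are made up for illustration, and the address and port in the commented-out usage line are assumptions that you would swap for wherever your own coordinator listens.

```python
import socket
import time
from typing import List, Optional


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the time taken to open a TCP connection, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return None


def summarise(results: List[Optional[float]]) -> dict:
    """Count failed probes and report the slowest successful connect."""
    ok = [r for r in results if r is not None]
    return {
        "attempts": len(results),
        "failures": len(results) - len(ok),
        "worst_seconds": max(ok) if ok else None,
    }


def monitor(host: str, port: int, attempts: int = 60, interval: float = 1.0) -> dict:
    """Probe the device once per interval and summarise the outcome."""
    results = []
    for _ in range(attempts):
        results.append(probe_tcp(host, port))
        time.sleep(interval)
    return summarise(results)


# Hypothetical usage; the address and port below are placeholders:
# print(monitor("192.168.1.50", 6638))
```

Running it once with the network idle and once during a large download should show whether the failure count climbs with the traffic.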
I immediately had suspicions – the Virgin Media Hub. I’ve been using this as a temporary stop gap for two years. Let’s just say that I doubt these devices are made to be the most reliable. Now that I had a reliable reproduction, I decided to start swapping out hardware. My network plans are still undecided, so as another temporary stop gap, and to eliminate network hardware as the source of the problem, I purchased a Ubiquiti USW-Flex-Mini switch to plug all my devices into, leaving the Virgin Hub to broadband routing duties only.
And this fixed the problem. It was hardware.
I naively thought that one Ethernet stack was much the same as another these days, but it seems I was wrong: once I replaced the switching hardware with something of decent quality, the problems went away.
-
A single-binary CLI tool that detects a software project’s toolchain, configuration, and conventions, then outputs a structured report.
-
MacBook Neo Teardown: Apple’s Most Repairable Laptop?
This is very promising. I’m hopeful for newer, more pro, machines in the future. I found the modular jacks and ports surprising in this teardown. I didn’t know that was a thing, I’ve only ever seen them soldered directly to the board. That makes one of the most common repairs – broken ports – far simpler it seems.
And they seem to provide manuals, which shocked me.
-
…for the last few months I’ve been iterating on kiq, a new administration interface for Sidekiq, a speedy terminal application based on Kerrick Long’s ratatui_ruby gem
-
sandbox-exec: macOS’s Little-Known Command-Line Sandboxing Tool
I wonder how many tiny tools are lurking in macOS somewhere waiting to be found.
-
As I’ve noted before, why do events all need to happen in the same week? There must be a German word for this 🤔
This week I was out in the evening for three nights in a row. #madlad
Tuesday was the regular monthly York Ruby, and it was nice to catch up with people as usual. Wednesday was comedian Rob Auton. And Thursday was John Bramwell of I Am Kloot fame.
I wasn’t sure what sort of show John Bramwell would put on, having never seen him or I Am Kloot live before, but it was a fantastic mixture of storytelling and fabulous musical performances.
But there’s more. Due to “unforeseen circumstances”, and following the Queen’s lead, my official birthday was moved to Saturday and we went out for a nice dinner too.
Normal service is now resumed.
-
Spurred on by the ZigBee issues, I moved a project I’d been thinking about for a while forward – log aggregation in the ol’ Home Lab. Searching across individual log files to try and find out what is wrong with something is not fun, so having a system that ingests all the logs and makes them searchable would make things much easier.
Grafana is often cited in situations such as these. In the past my experiences with Grafana have not been great. But these were also situations in which I did not set anything up myself, and my understanding of how things were working was very limited. I think this reduced the usefulness of it for me.
However, I have no real experience with this sort of platform (I’ve used DataDog, but that is out of scope, and budget), so I just went with Grafana. With LLM help it was actually fairly easy to get going. I’ve been finding the combination of Docker and AI to be super-effective at getting these kinds of software stacks set up.
I didn’t actually use Grafana to diagnose anything in the end, but it will be useful in the future for answering the question “What happened at 3:07am?” by searching across all the log files at once. I will learn more about how to operate Grafana as time goes on and problems that need diagnosing appear.
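To make the “What happened at 3:07am?” question concrete, here is a minimal sketch of the same idea done by hand in Python: pull every line near a given moment out of a directory of log files. It is not how Grafana works and not part of my setup. The function names are hypothetical, and it assumes each log line starts with a 19-character ISO-style timestamp without a time zone, and that the files end in .log.

```python
from datetime import datetime, timedelta
from pathlib import Path


def matching_lines(name, lines, moment, window):
    """Yield (timestamp, source name, line) for lines within window of moment."""
    for line in lines:
        try:
            # Assumes the line starts with e.g. "2026-03-22 03:07:30".
            when = datetime.fromisoformat(line[:19])
        except ValueError:
            continue  # No parseable timestamp at the start, so skip the line.
        if abs(when - moment) <= window:
            yield when, name, line.rstrip("\n")


def search_logs(log_dir, moment, window=timedelta(minutes=1)):
    """Collect matching lines from every *.log file, ordered by timestamp."""
    hits = []
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(errors="replace") as handle:
            hits.extend(matching_lines(path.name, handle, moment, window))
    return sorted(hits)
```

Something like `search_logs("/path/to/logs", datetime(2026, 3, 22, 3, 7))` would then return everything logged within a minute of 3:07am, whichever file it came from. A proper log aggregation stack does the same job with indexing, retention and a UI on top.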
-
Starlink Mini as a failover (Via Harry)
This is very cool. And cheap. Shame about him.
-
Nobody Gets Promoted for Simplicity
I’ve seen engineers (and have been one myself) create abstractions to avoid duplicating a few lines of code, only to end up with something far harder to understand and maintain than the duplication ever was.
God forbid you should ever have to change something in two places. Oh the horror!
The actual path to seniority isn’t learning more tools and patterns, but learning when not to use them. Anyone can add complexity. It takes experience and confidence to leave it out.
And permission from others.
-
OPTICIAN SANS – “A free font based on the historical eye charts and optotypes used by opticians world wide.”