Systemd through the eyes of a musl distribution maintainer

by awilfox on 1/6/24, 7:02 AM with 152 comments
by chasil on 1/6/24, 3:01 PM

I use much functionality in systemd that was not present in SysV init, and I really appreciate it. It has never crashed any OS that I have run.

However, there are a few aspects of it that are inconvenient.

Automount units use an unintuitive naming scheme, and you are not free to name them as you wish (as you might for a socket unit). If you are mounting an NFS volume immediately below the root directory, you don't see the problem, but if the mount is several directories deep and/or uses ASCII symbols (non-alphanumeric), it is not pretty.
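To make the naming rule concrete, here is a minimal Python sketch of the path escaping that `systemd-escape --path` applies when a mount or automount unit name is derived from its mount point. The function name `systemd_escape_path` is made up for this illustration, and the sketch only approximates the real tool (for example, it does not reject `..` components).

```python
def systemd_escape_path(path: str) -> str:
    """Approximate `systemd-escape --path` for an absolute path."""
    # Dropping empty components removes leading, trailing and doubled slashes.
    parts = [part for part in path.split("/") if part]
    if not parts:
        return "-"  # the root directory is a special case
    escaped = []
    for index, byte in enumerate("/".join(parts).encode("utf-8")):
        char = chr(byte)
        plain = byte < 128 and (char.isalnum() or char in ":_")
        if char == "/":
            escaped.append("-")            # path separators become dashes
        elif plain or (char == "." and index > 0):
            escaped.append(char)           # kept as-is (a leading dot is not)
        else:
            escaped.append(f"\\x{byte:02x}")  # everything else: C-style hex escape
    return "".join(escaped)


# A mount point with a dash in it shows why the names get ugly:
print(systemd_escape_path("/mnt/my-data") + ".automount")  # mnt-my\x2ddata.automount
```

The dash in `my-data` has to be escaped because a plain dash already stands for a slash, which is where the unintuitive names come from.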

Socket units require two files per port. When I am moving complex inetd.conf setups to Linux, it's far easier to implement them with busybox inetd than convert dozens/hundreds of services to unit files, despite the increased functionality. Somebody has probably written some scripting to do this.
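As a rough idea of what such scripting could look like, here is a hedged Python sketch that turns one inetd.conf entry into the text of a socket unit and a service unit. The function name `inetd_line_to_units` is hypothetical, `internal` services and `user.group` syntax are not handled, and the generated units would need review before use.

```python
import socket


def inetd_line_to_units(line: str) -> dict[str, str]:
    """Turn one inetd.conf entry into {file name: unit file text}."""
    name, sock_type, _proto, wait, user, server, *args = line.split()
    # inetd.conf may give a port number or an /etc/services name.
    port = int(name) if name.isdigit() else socket.getservbyname(name)
    listen = "ListenStream" if sock_type == "stream" else "ListenDatagram"
    # "nowait" means one process per connection, which maps to Accept=yes.
    accept = wait.startswith("nowait")
    # With Accept=yes systemd starts instances of a template unit (name@.service).
    service_file = f"{name}@.service" if accept else f"{name}.service"
    # The first inetd argument is argv[0]; ExecStart= takes the program path instead.
    command = " ".join([server, *args[1:]])
    socket_unit = (
        "[Socket]\n"
        f"{listen}={port}\n"
        f"Accept={'yes' if accept else 'no'}\n"
        "\n"
        "[Install]\n"
        "WantedBy=sockets.target\n"
    )
    service_unit = (
        "[Service]\n"
        f"ExecStart={command}\n"
        f"User={user}\n"
        "StandardInput=socket\n"
    )
    return {f"{name}.socket": socket_unit, service_file: service_unit}
```

Calling it on a made-up entry such as `2323 stream tcp nowait nobody /usr/sbin/in.echod in.echod -v` returns the contents for `2323.socket` and `2323@.service`, which is still two files per port, just generated instead of hand-written.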

I am not aware of any include directive for my own directories, so I don't have to place everything in /etc/systemd/system. There probably is a way to do this, and I am betraying my ignorance.

And my, things can get messy in a hurry in /etc/systemd/system.

I don't know how to configure users to be able to maintain their own (personal) units.

And lastly, it's so seductive that I have no idea how to do many things in other operating systems that I easily do in Linux. I wish this itself was not a walled garden (but I'm not leaving).

by INTPenis on 1/6/24, 1:19 PM

Nice article, after years of emotional flaming finally something fair and balanced about systemd.

The point about there being no competition and no respect for glibc alternatives is very valid, and to that I can only say: if you build it, they will come.

Systemd had the advantage of Red Hat backing so they got a great headstart. But in general people will use whatever works. So roll your sleeves up and get cracking.

by greyw on 1/6/24, 3:08 PM

Reading the section about resolved reminds me of musl having a "broken" stub resolver implementation for years (it failed whenever the DNS payload was larger than 512 bytes). "Broken" because the TCP fallback was supposedly intentionally not added for complexity reasons.
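For readers unfamiliar with the fallback being discussed, here is a minimal Python sketch of it: send the query over UDP, and if the reply has the TC (truncated) bit set, repeat the same query over TCP with a two-byte length prefix. The function names are made up for this illustration, and this is not musl's or resolved's code.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query (qtype 1 is an A record)."""
    # Header: id, flags (only RD set), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN


def is_truncated(response: bytes) -> bool:
    """True if the TC bit (0x0200) is set in the header flags."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & 0x0200)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("DNS server closed the connection early")
        data += chunk
    return data


def resolve(name: str, server: str, timeout: float = 3.0) -> bytes:
    """Return the raw DNS response, retrying over TCP if UDP was truncated."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(512)  # classic UDP limit without EDNS
    if not is_truncated(response):
        return response
    # TCP fallback: each message is prefixed with its length as 16 bits.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", recv_exact(tcp, 2))
        return recv_exact(tcp, length)
```

The extra code is small here, but a real resolver also has to handle timeouts, retries and several servers on both transports, which is presumably where the complexity argument came from.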

DNS and all the additions seem to be really not easy to handle.

by macNchz on 1/6/24, 2:18 PM

I was a big skeptic earlier on, but I have come to appreciate systemd in recent years. That said, the section on resolved is a big one–it has been one of the only things that consistently causes me annoyance over the three years since I started primarily using Linux on the desktop.

After one particularly deep rabbit hole where I never actually solved a problem where local network mDNS lookups always took exactly 6 seconds, I wound up disabling it entirely in favor of unbound, which works great but revealed that a handful of other things seem to expect resolved to be present and won’t work properly without it.

by sevagh on 1/6/24, 1:56 PM

Systemd is a godsend for people who have to administer diverse Linux boxes.

I can see how people who want to run their own tight ship on their machine would balk at it. I also sort of hate systemd-resolved and the fact that it instantly rendered worthless 99% of online guides on how to unfuck your local DNS resolution.

by kemotep on 1/6/24, 2:17 PM

> I prefer my init scripts to be handcrafted by local artisans. Each time the computer boots up is an objet d’art.

In all seriousness this is a wonderful article. Has the author/op seen this talk[0] by a FreeBSD developer on systemd?

What I like about the talk is asking about what is going to happen when we need to move on to newer ways of doing things that will have advantages and disadvantages over the old way. We can see a similar story with X11 and Wayland. Systemd does do some things better than the old way but does have flaws.

[0]:https://youtu.be/o_AIw9bGogo

by frankjr on 1/6/24, 3:41 PM

I generally embrace systemd and have been pretty happy with it but there's one component which simply doesn't work correctly and that's systemd-resolved in combination with DNSSEC. I eventually had to replace it with Knot Resolver which works flawlessly on the same machine / network.

https://github.com/systemd/systemd/issues/9867

https://www.knot-resolver.cz/

by habitue on 1/6/24, 1:52 PM

Maybe a good way to get competition is to give someone a possibly dubious (but fun!) goal of writing a systemd compatible alternative in rust.

Reimplementing such a massive piece of infrastructure is pretty daunting. I don't think many people who are not being paid to would embark on it and then keep up the momentum to actually cover all those edge cases and maintain a community around it.

Rewriting things in rust seems to be pretty motivating though, so maybe it could be a force for good here

by ofrzeta on 1/6/24, 1:45 PM

I've found systemd-nspawn to be a great alternative to the olde chroot method for fixing systems in a rescue shell. systemd-nspawn -D /mnt mounts the system like chroot but the real value comes with systemd-nspawn -b -D /mnt/ that actually boots the system in a container.

by t43562 on 1/6/24, 2:13 PM

I'm enjoying artix linux with dinit. I tried a couple of other alternatives - s6 and openrc - although I wouldn't say I was that fair on openrc.

dinit may lack many things - I don't know - I haven't had any reason to be unhappy with it and it seems to work in a fairly understandable way. I defined a new service (for minidlnad) and that appears to be straightforward.

by nijave on 1/6/24, 2:59 PM

More structured metadata in journald is nice and I don't have any qualms with binary logging formats, but the author does have a point on log shipping. Additionally, reading journal files has always seemed painfully slow to me (30 seconds of 100% CPU usage)

I'm not overly fond of timer units, either. They seem feature rich but much more complicated to set up than a single line in a crontab.
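To show how the two notations relate, here is a small Python sketch that translates simple crontab schedules into the calendar expression a timer unit's `OnCalendar=` setting takes. The helper names are hypothetical, and only `*`, plain numbers and comma lists are supported; ranges and steps raise `ValueError`.

```python
DAY_NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # cron allows 0 and 7


def _field(value: str, width: int = 2) -> str:
    """Zero-pad the numbers in a cron field; '*' passes through unchanged."""
    if value == "*":
        return "*"
    return ",".join(f"{int(item):0{width}d}" for item in value.split(","))


def cron_to_oncalendar(schedule: str) -> str:
    """Translate 'minute hour day month weekday' into an OnCalendar= value."""
    minute, hour, day, month, weekday = schedule.split()
    # systemd writes the date as year-month-day; cron has no year field.
    date_time = f"*-{_field(month)}-{_field(day)} {_field(hour)}:{_field(minute)}:00"
    if weekday == "*":
        return date_time
    names = ",".join(DAY_NAMES[int(number)] for number in weekday.split(","))
    return f"{names} {date_time}"


print(cron_to_oncalendar("30 2 * * *"))   # *-*-* 02:30:00
print(cron_to_oncalendar("0 6 * * 1,5"))  # Mon,Fri *-*-* 06:00:00
```

One caveat: when both day-of-month and weekday are restricted, cron fires if either matches while a calendar expression requires both, so such entries do not translate one-to-one. And the expression is only half the work, since the timer still needs a matching service unit to run.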

On the other hand, real dependency management between daemons, mounts, and sockets is a huge win.

I can't quite remember the right terminology but afaik systemd-resolved supports routing different domains to different DNS servers on different network interfaces, which can be auto-configured via DHCP. The practical implication being, you can connect to a split tunnel VPN and domains accessed over the VPN get routed to the VPN DNS server.
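The idea can be sketched in a few lines of Python: each link carries a list of domains and DNS servers, and a lookup goes to the link whose domain is the longest suffix of the name being resolved. The data layout, the link names and the function name are assumptions made for this illustration; resolved's real logic has more cases than this.

```python
def pick_dns_servers(hostname: str, links: dict[str, dict]) -> list[str]:
    """Return the DNS servers of the link whose domain best matches hostname."""
    labels = hostname.rstrip(".").lower().split(".")
    best_length, best_servers = -1, []
    for link in links.values():
        for domain in link["domains"]:
            trimmed = domain.strip(".").lower()
            suffix = trimmed.split(".") if trimmed else []  # "." matches every name
            if len(suffix) > len(labels):
                continue
            # Compare whole labels so "notcorp.example" does not match "corp.example".
            if labels[len(labels) - len(suffix):] == suffix and len(suffix) > best_length:
                best_length, best_servers = len(suffix), link["dns"]
    return best_servers


# Hypothetical setup: a home Wi-Fi link plus a split-tunnel VPN link.
LINKS = {
    "wlan0": {"domains": ["."], "dns": ["192.168.1.1"]},
    "tun0": {"domains": ["corp.example"], "dns": ["10.8.0.1"]},
}
print(pick_dns_servers("git.corp.example", LINKS))      # ['10.8.0.1']
print(pick_dns_servers("news.ycombinator.com", LINKS))  # ['192.168.1.1']
```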

by ivolimmen on 1/6/24, 3:37 PM

I like the honesty of the article. I never got the grudge others have against it. I am a software engineer and work in office automation (a.k.a. the boring stuff). On Linux I am a power user. When Ubuntu used initd, I used it. They switched to systemd, so I use that. I liked it. Making an application start on reboot was easy to do, and easier than with the rc files 20 years back.

by remram on 1/6/24, 3:36 PM

I think the problem of systemd is that it is complex. It is fine when it works correctly, but when you eventually have to dig into any sort of problem, it is hell.

For example, all of my boxes spew many logs per minute about "Failed to set up mount unit". This is apparently a bug with the generated name of some internal unit related to mountpoints (too long). This one is not as bad as others, because it does have a bug reported for it, but is still something I'll have to deal with until I upgrade the distro on all the servers. Many similar bugs I can't track down at all.

Systemd makes a lot of things easy, but not simple, and that is a big problem in practice.

by gavinhoward on 1/6/24, 3:56 PM

I think the author is right on just about everything.

But the author is especially right on the need for competition. So I am working on it.

I am building a build system, and I am making it usable as a library. When I have made it so, I will implement a near drop-in [1] replacement for systemd.

You'll be able to use systemd unit files, and there will be an option to have binaries with the same name (off by default to not interfere, though) so that users don't have to learn new stuff right away.

[1]: I am not going to implement journald for example; logs will be text, even if they are structured internally.

Besides, implementing things to be perfectly compatible would only strengthen the monoculture, not weaken it. The plan is to be compatible on the things that matter for distro integration, so that distros can ship both with little work, but have other things different, so that users can choose what works best for them between the two.

by t43562 on 1/7/24, 5:23 PM

Linux isn't what it was - it's contributed to via a lot of companies with commercial interests. The open source developers who started doing it all for love are often doing it for money now. Certain features are desired, there's a way to do it, they do it.

The actual "customers" of linux now - the ones providing the money that employs people - aren't Linux enthusiasts but big companies. So it's absolutely obvious that their needs are "what matter".

It's ok. The rest of us can fork and "be irrelevant" in the same way Linux was irrelevant before it became popular.

by WesolyKubeczek on 1/6/24, 5:38 PM

I would in fact love a portable user-level process manager that can speak systemd unit files.

Something like supervisord, but you use systemd unit files, and it tries to do as much as it can within limitations (process tracking sure can be wonky, no dbus everywhere, no cgroups everywhere, no absolute freedom in resource limitations if you’re not PID 1).

by traverseda on 1/6/24, 3:55 PM

I originally wrote this comment for reddit, but I feel like hackernews might appreciate it. If you think it's too long or detracting from the article or something feel free to downvote, I won't be offended.

Here are some of my criticisms, although I do still use systemd daily on my personal devices and servers.

* Bad security

Systemd is architected in a way that has a lot of code running as root, it's also written in a language that isn't memory safe. This means it has a large attack surface (a lot of code you need to make sure is bug free to be secure) and it's harder to make sure your code doesn't have really severe security related bugs (The [NSA recommends](https://www.nsa.gov/Press-Room/Press-Releases-Statements/Pre...) using memory safe languages for critical stuff like this).

There are certainly other critical projects (like the Linux kernel) that are similarly important and written in memory unsafe languages, but systemd has had some [pretty critical vulnerabilities](https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=systemd) discovered that do not inspire confidence in their ability to use these kinds of dangerous languages safely, and they don't have nearly the same budget as the Linux kernel for detecting and preventing these kinds of issues.

This is less of a problem if you're a large enterprise customer running up to date SELinux, but it should still have been possible to write systemd in a way that limited the pid1 attack surface while retaining all current functionality.

* Journalctl is a pain for desktop users and smaller teams

Journalctl is how systemd manages logs. By default it saves logs in a binary format with additional metadata. This makes it easier for large teams to ingest the logs into a centralized log-collection daemon, like the ones offered by redhat for enterprise deployments, but it breaks a lot of workflows that older sysadmins probably used. Things like just rsync-ing a bunch of logs to one place, or using tools like grep and find to inspect logs. Systemd does of course provide replacements for those tools, instead of using grep you can use journalctl to search through your logs, but you could use grep to search through any text file. Config files, source code, or logs. Now I need to memorize all the flags for one more tool, and change a bunch of stuff about how I collect logs.

This also presents a challenge for people doing embedded work, as you can't just grab the SD card out of a system and look at its logs. You need a working journalctl CLI on your host machine. They've added a --directory flag, but in the past this was much harder, requiring you to actually chroot into the embedded system (that may be broken in strange ways) in order to read logs.
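As an illustration of the SD card workflow, here is a hedged Python sketch that points the host's journalctl at a journal directory with `--directory`, asks for JSON output and then filters messages grep-style. The helper names and the example path are made up, and the sketch assumes a journalctl on the host that can read the card's journal format.

```python
import json
import subprocess


def journal_command(directory: str, unit: str = "") -> list[str]:
    """Build a journalctl call that reads journal files from another directory."""
    command = ["journalctl", f"--directory={directory}", "--output=json", "--no-pager"]
    if unit:
        command.append(f"--unit={unit}")
    return command


def grep_journal(lines, needle: str):
    """Yield (unit, message) for JSON journal lines whose MESSAGE contains needle."""
    for line in lines:
        entry = json.loads(line)
        message = entry.get("MESSAGE")
        if not isinstance(message, str):
            continue  # binary messages are exported as byte arrays; skipped here
        if needle in message:
            yield entry.get("_SYSTEMD_UNIT", "?"), message


def search_card(directory: str, needle: str) -> list[tuple[str, str]]:
    """Run journalctl on the given directory and search its messages."""
    result = subprocess.run(
        journal_command(directory), capture_output=True, text=True, check=True
    )
    return list(grep_journal(result.stdout.splitlines(), needle))
```

With the card mounted at a hypothetical `/media/sdcard`, something like `search_card("/media/sdcard/var/log/journal", "mount")` would list matching messages, though it still depends on having journalctl installed, which is the complaint.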

Journalctl has advantages, but they're mostly enjoyed by large enterprises.

Yes I know it's actually systemd-journal or just "journal" or whatever they call it. Journalctl is the command most users will be familiar with though.

* Poor [Locality of behavior](https://htmx.org/essays/locality-of-behaviour/) makes it harder to reason about

When you're trying to understand how a system works it's nice to be able to see everything in one place. There are like 9 different places a systemd unit file can live, you can apply an over-ride to a unit file, and unit files can depend on other files like socket files.

This is good for large teams as it makes it easier for specific groups in a company to claim ownership over parts of the system, but it means as a desktop user or sysadmin working with a small business you've added a lot of complexity. You can't just type `ls /etc/init.d` to get a rough overview of what services exist, you need to memorize more systemd-specific commands. If you want to edit a service you can't just edit a service, you need to create an over-ride using another systemd specific command, make sure you have the EDITOR environment variable set up, and then open the original service in another editor so you can compare the two.
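To illustrate why one `ls` is no longer enough, here is a rough Python sketch of the lookup: the first directory in the search order that contains the unit provides the main file, and `*.conf` drop-ins from `<unit>.d/` directories in every location are layered on top. Only three well-known directories are listed (`systemd-analyze unit-paths` prints the full list), the function name is hypothetical, and masking and symlinks are ignored.

```python
from pathlib import Path

# A shortened search order, highest priority first.
DEFAULT_DIRS = ["/etc/systemd/system", "/run/systemd/system", "/usr/lib/systemd/system"]


def locate_unit(name: str, search_dirs=DEFAULT_DIRS):
    """Return (main unit file or None, drop-in files in the order they apply)."""
    main_file = None
    drop_ins = {}
    for directory in map(Path, search_dirs):
        candidate = directory / name
        if main_file is None and candidate.is_file():
            main_file = candidate  # earlier directories win
        drop_in_dir = directory / f"{name}.d"
        if drop_in_dir.is_dir():
            for conf in sorted(drop_in_dir.glob("*.conf")):
                # Same file name in two places: the higher-priority copy wins.
                drop_ins.setdefault(conf.name, conf)
    return main_file, [drop_ins[key] for key in sorted(drop_ins)]
```

So a hypothetical `locate_unit("sshd.service")` can easily return a vendor file under /usr/lib plus an override under /etc, and you need both to know what actually runs.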

It creates some more work and complexity and encourages you to use a bunch of systemd-specific tools (and presumably get red-hat certified training).

* People use systemd stuff before it's ready

I'm not sure this is something I can blame systemd or redhat for, but the official stance of redhat is that systemd-resolved is not ready for production, and yet it's used all over the place. That can give people a poor impression of systemd after the 9th time they try to do something even slightly different with their networking setups and systemd-resolved breaks, not to mention the numerous security issues in systemd-resolved.

* Unix philosophy

A lot of people say that either unix philosophy doesn't matter, or that systemd does embrace unix philosophy. That stuff about logging and systemd-specific tools I mentioned above? That's what people actually mean when they talk about unix philosophy, they mean being able to grep through their logs and rsync logs to a remote server. They mean using standard text files and not needing to have a special command that wraps your text editor to "properly" edit a unit file.

* Doesn't run in a chroot

As someone who splits my time between embedded linux and server linux this is just a personal pet peeve of mine. It makes it very hard to debug embedded systems that use systemd if you're not using systemd on your workstation. It does feel like I'm being forced to use systemd sometimes, and while I've largely gotten over it I'm still a bit bitter. It's also made some small personal projects, like getting a full linux distro running on a KoBo e-reader, much much more difficult than they had to be.

It's a mess under docker, and why I need to use alternative OCI-runtimes like nestybox to do a bunch of testing for embedded systems. Thankfully I wasn't an early adopter to docker and didn't have those problems until there were already mature solutions. But there's really no reason why it should have to run as pid1, other than them wanting you to use docker-alternatives that are deeply integrated with systemd, like their podman tool or systemd-nspawn. This was just such a blatant attempt to abuse their near-monopoly position that it bears some extra whining.

* OpenRC does everything systemd does, but better

Unfortunately other red-hat influenced projects like Gnome won't support or test on non-systemd init systems, let alone providing default services files for them, meaning that any distro that wants to be compatible with gnome will need to do a bunch of extra work to write and test service files. For one project that's potentially reasonable, and certainly there are distros that do that extra work, but for projects like Arch who have the explicit goal of sticking as close to upstream sources as possible it makes it more or less impossible.

Systemd survives not because it's a good solution to the problem, but because it has a large corporate backer, is widely deployed, and is a safe thing to code against. There's a very old IT saying, "No one ever got fired for buying IBM". If you pick a safe industry-standard option no one can blame you if it goes wrong, even if it's the technologically inferior option.

As much as I'm a systemd-hater I still do use it on my personal devices and servers, because it's by far the path of least resistance.

I hope we see a situation similar to pulseaudio and pipewire, where the pulseaudio rewrite was much much nicer than the original. I don't think that's going to happen until systemd slows down though, right now if you tried to re-implement systemd I suspect you'd get the rug pulled out from under you as they changed standards and behaviors (I've seriously thought about doing it myself, at least for relatively simple unit files). I'd still prefer to be using OpenRC, as I don't know how a rewrite would deal with the locality-of-behavior issues, but systemd has been getting better and more reliable over time.

by Philpax on 1/6/24, 2:16 PM

This is quite a balanced take on the matter. Appreciated!

by dijit on 1/6/24, 1:08 PM

Before this topic boils down to: SystemD gave me cancer/SystemD cured my aids as it always seems to. Please permit me to get my opinion across (as it is more nuanced) before I continue.

I sincerely believe that systemd solved a problem that nobody was willing to solve, and it has every right to solve those problems any way it wants (when you ask for help you don't get to choose how you are helped after all) - my concerns boil down to the fact that the adoption it has seen has led indirectly (or, directly) into a monoculture; and a monoculture that almost certainly will stifle innovation since a replacement will need to be bug-for-bug compatible. Prior inits were actually quite easy to replace and alternatives were often used, but these days most software assumes systemd and this situation gets worse every year.

That said, and with the knowledge that I am afraid of systemd as a monoculture (and a relatively opaque one), this line is something I vehemently disagree with: "systemd, as a service manager, is not actually a bad piece of software by itself. The fact it can act as both a service manager and an inetd(8) replacement is really cool."

For starters, (x)inetd is an anti-pattern. Everything I understand about systems development indicates we should be separating concerns as much as possible; having one super-server that launches everything under one daemon is directly opposed to this.

"But", I hear you thinking already: "systemd runs services as independent users, it solves that!", and I would agree with you, except now pid 1 is listening to the network instead. That doesn't strike me as much better.

If there's a bug/backdoor in your binary distributed version of systemd then you are SOL and your whole system is owned as root, but at least a bug in your application might not expose your entire inetd user. :\

It's also a common issue that inetd's architecture can lend itself to getting DoS'd harder than other more-standard daemons, except now it's your pid 1 being DoS'd; not sure how you recover from that honestly.
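For context on what the handoff looks like from the service's side: with socket activation, systemd binds and listens on the socket and then passes it to the service as an inherited file descriptor, announced through the `LISTEN_PID` and `LISTEN_FDS` environment variables, with descriptors starting at 3. Below is a minimal Python sketch of the receiving end for a non-`Accept=` service; the function names are made up for illustration.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the sd_listen_fds convention


def inherited_fds(environ=os.environ, pid=None) -> list[int]:
    """Return the descriptor numbers passed in by the service manager, if any."""
    pid = os.getpid() if pid is None else pid
    # LISTEN_PID guards against the variables leaking into unrelated children.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve_once() -> None:
    """Accept one connection on each inherited listening socket and greet it."""
    for fd in inherited_fds():
        listener = socket.socket(fileno=fd)  # wrap the already-listening socket
        connection, _peer = listener.accept()
        with connection:
            connection.sendall(b"hello\n")
```

Note that in this mode the service itself calls `accept()`, so pid 1 only holds the listening socket until the first connection arrives; with `Accept=yes`, the inetd-style mode, systemd accepts each connection itself.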

EDIT: if you are going to downvote, please provide reasons. Sick of this holy war, let's just end it reasonably please.

by smitty1e on 1/6/24, 5:00 PM

The Famous Article (TFA) is an exemplar of dispassionate, substantive, liberal criticism.

The shortcomings of systemd seem likely the overall shortcomings of open source: none of the $PROJECT maintainers experience $PAIN_POINT, so it is simply not a priority for $PROJECT.

Somehow this seems a variation on the tragedy of the commons => https://en.m.wikipedia.org/wiki/Tragedy_of_the_commons

Without some capitalist skin in the game, the $PAIN_POINTS become difficult to prioritize.

by peter_d_sherman on 1/6/24, 7:07 PM

Most people (including most Unix greybeards!) really don't understand Unix's 'init' (AKA "the init daemon", "the init process", etc., etc.) -- much less any of its massively-increasing-in-LOC (and complexity!) successor programs...

So we need to start with 'init'.

'init' -- even in its absolute first, simplest incarnation -- is still too complex to understand correctly!

You see, we need to shift perspectives!

We need to shift perspectives from a longtime System Administrator -- to that of a new barebone OS programmer.

What is 'init'?

Is 'init' a program that handles runlevels, starts and stops services, that mounts filesystems, that processes messages, that captures dead processes, that waits for hardware to become available, that logs and maintains informational/database/etc files, that starts audio, that starts X11, that starts the GUI, that acts as a proxy for sockets, or does anything else with the system?

No!

From the point of view of a new barebone OS programmer (as Dennis Ritchie and Ken Thompson were when they invented Unix and invented 'init') -- 'init' is NONE of these things!

'init' is only THE FIRST PROGRAM, THE FIRST COMPUTER CODE THAT RUNS IN USER SPACE.

And that's it!

That is all that 'init' ever is, or ever was!

(User space, to recap, is the unprotected AKA "unprivileged" AKA "non-supervisor" memory running unprotected (AKA "user-land") code: https://en.wikipedia.org/wiki/User_space_and_kernel_space)

'init' (and every single 'init' successor program, i.e., OpenRC, systemd, etc.) -- is the first program, the first set of computer code OUTSIDE OF KERNEL CODE (which has been running and is currently still running) to be run by the system.

Now, what should that first program do?

See, that's the magic question -- which gives rise to all that is to follow!

In theory you could have an OS where the 'init' program, or its equivalent -- did absolutely nothing! But that wouldn't be very productive!

If the 'init' program isn't itself a shell program (i.e., sh, bash, etc.) -- then (because there's no GUI at this point) the computer will not be able to accept typed command-line commands -- which is the first thing that you want a new OS to do!

So now our 'init' expands in scope (and lines of code)!

Our 'init' could be hardcoded to launch 'sh' or 'bash' (or whatever shell program exists) -- but what if the user wants to change that?

OK, so now we need our first configuration file. Where to put that exactly?

Oh, it's on a filesystem that hasn't been mounted yet?

Well, maybe init should mount that filesystem!

Point is, there's a set of problems (and sub-problems!) -- which give rise to increasing and increasing init's functionality over time!

init, as the first user-space program for an OS to run, on whatever OS it is run on, in whatever form it is in -- could simply be written to run and 'outsource' all of its functionality to other programs...
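As a toy illustration of that "outsourcing" shape, here is a Python sketch of the core loop: start one other program, then keep collecting exited children until that program is gone. The name `run_and_reap` is made up, and a real init would also need signal handling, shutdown logic and more; this shows only the fork/exec/wait skeleton.

```python
import os


def run_and_reap(argv: list[str]) -> int:
    """Start argv as a child and reap children until that child has exited."""
    child = os.fork()
    if child == 0:
        try:
            os.execvp(argv[0], argv)  # replace the child with the real program
        finally:
            os._exit(127)  # only reached if the exec itself failed
    exit_code = None
    while exit_code is None:
        # Blocks until any child exits; as PID 1 this would also collect
        # orphaned processes that were re-parented to init.
        pid, status = os.wait()
        if pid == child:
            exit_code = os.waitstatus_to_exitcode(status)
    return exit_code
```

Calling something like `run_and_reap(["/bin/sh"])` gives the "init is just a shell launcher" starting point, and every question that follows (which program? configured where? on which filesystem?) is what grows the line count.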

But init (as it evolved into its very large LOC complex descendants) -- became a "dumping ground" -- for functionality that was inconvenient to go in other places and/or to be outsourced to other programs.

See, all of the code in all userland Linux utilities -- could in theory be grafted together into one big super program in userspace.

It would have the same functionality as all of the individual Unix/Linux command-line programs put together (and maybe that would be desirable to some people). But from a Software Engineering "separation of concerns" AKA dependency reduction AKA modularity AKA "do one thing and do it right" AKA loose-coupling perspective -- doing that might not be so desirable!

And yet, with the complexity brought about by 'init' descendants -- it seems like we're going down that exact route!

Which leads us full circle (because history always repeats itself!) -- back to the reason why Unix was created -- because of the complexity and problems brought about by the complexity of its predecessor, Multics!

https://en.wikipedia.org/wiki/Multics

Point is -- 'init' in whatever form it takes -- is by no means obligated to do anything -- although if it is to do nothing, then it should at least launch one other program which will do something! If that's the case, then why not put that under user control? But wait, if we're doing that, why not make it launch multiple other programs! OK, now we need a file to tell init where that should be! But what if the filesystem for that file is not mounted?

Anyway, you see how the "rabbit hole" of problems (and increasing LOC complexity) forms!

Related: https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-a...