Use a minimal initrd to switch to the full initrd stored in /usr #3241

pothos · 2025-09-03T12:53:48Z

Binaries have grown over time and new features were added, filling the
boot partition to the point where the kernel+initrd barely fits twice,
as updates require. We employed workarounds such as wrapper scripts for
ignition, afterburn and other binaries so that they are loaded from
/usr. This was still not enough, and we would have had to do the same
for (network) kernel modules and firmware. To avoid making this ever
more complex we can use a dedicated minimal initrd whose only job is to
load the full initrd from /usr. The full initrd can then use dracut as
before, and we can even drop all the workarounds we accumulated.

Generate a minimal initrd to use instead of the full bootengine initrd.
The bootengine initrd gets stored as squashfs on /usr. The minimal
initrd still includes the early_cpio for amd64 microcode updates.
It includes a fixed list of modules or module directories, limited to
what is needed for loading /usr and for emergency console interaction.
The dependencies of these modules have to be resolved and copied over,
too. The busybox, veritysetup, and kmod binaries are needed and get
their required libraries resolved and copied over. They are not static
and use shared libraries, which should be ok for now. The resulting
vmlinuz file is 27 MB for amd64, down from ~60 MB. This leaves enough
room to include more kernel modules and so on for the next years.
Meanwhile we also grow the boot partition and wait for users to
redeploy, until we can rely on a larger boot partition and eventually
drop the minimal initrd again.
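
The module dependency step can be illustrated with a minimal sketch. It assumes the dependencies are read from a `modules.dep` file in the usual depmod text format (`module: dep1 dep2 ...`). The sample data and the function names are made up for illustration and are not taken from the actual build script.

```python
from collections import deque


def parse_modules_dep(text):
    """Map each module path to the list of module paths it depends on."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        module, _, rest = line.partition(":")
        deps[module.strip()] = rest.split()
    return deps


def with_dependencies(wanted, deps):
    """Return the wanted modules plus everything they transitively need."""
    seen = set()
    queue = deque(wanted)
    while queue:
        module = queue.popleft()
        if module in seen:
            continue
        seen.add(module)
        queue.extend(deps.get(module, []))
    return sorted(seen)


# Invented sample, not the real Flatcar module list or real dependencies.
SAMPLE = """\
kernel/fs/squashfs/squashfs.ko:
kernel/drivers/md/dm-verity.ko: kernel/drivers/md/dm-bufio.ko kernel/drivers/md/dm-mod.ko
kernel/drivers/md/dm-bufio.ko: kernel/drivers/md/dm-mod.ko
kernel/drivers/md/dm-mod.ko:
"""

if __name__ == "__main__":
    deps = parse_modules_dep(SAMPLE)
    for path in with_dependencies(["kernel/drivers/md/dm-verity.ko"], deps):
        print(path)
```

Walking the graph instead of copying only the listed modules means a module that is pulled in indirectly cannot be forgotten when the fixed list changes.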

Pulls in flatcar/bootengine#110 for the
minimal initrd script and flatcar/seismograph#12
for making the device mapper discovery for the "rootdev" command more
reliable.

This also required a backport of a kernel patch from 2017 that exposes
the PARTUUID in the /sys uevent file.
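
As a rough illustration of what the PARTUUID entry enables, the sketch below looks up a block device by partition UUID from uevent text alone, without udev. It assumes the uevent file uses `KEY=value` lines with a `PARTUUID` key. The device names, the UUID and the function names are invented for the example and do not come from the minimal initrd script.

```python
def parse_uevent(text):
    """Parse KEY=value lines as found in a /sys uevent file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def find_by_partuuid(uevents, partuuid):
    """Return the device name whose uevent text carries the given PARTUUID.

    `uevents` maps a device name to the content of its uevent file.
    """
    wanted = partuuid.lower()
    for name, text in sorted(uevents.items()):
        if parse_uevent(text).get("PARTUUID", "").lower() == wanted:
            return name
    return None


# Invented sample data for two partitions.
SAMPLE_UEVENTS = {
    "vda1": "MAJOR=254\nMINOR=1\nDEVNAME=vda1\nDEVTYPE=partition\n"
            "PARTN=1\nPARTUUID=aaaaaaaa-0000-0000-0000-000000000001\n",
    "vda3": "MAJOR=254\nMINOR=3\nDEVNAME=vda3\nDEVTYPE=partition\n"
            "PARTN=3\nPARTUUID=11111111-2222-3333-4444-555555555555\n",
}

if __name__ == "__main__":
    print(find_by_partuuid(SAMPLE_UEVENTS, "11111111-2222-3333-4444-555555555555"))
```

On a running system the input would come from reading `/sys/class/block/<device>/uevent` for each device.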

How to use

Depends on flatcar/bootengine#110 and flatcar/seismograph#12

Also flatcar/flatcar-build-scripts#174 for the image size report (this only works once a first nightly that includes this change has been built)

Testing done

Tested on all clouds (Equinix Metal arm64 was tested manually). The original build got garbage-collected; a more limited new run is here

The bootengine.img initrd size/content reporting only works after the first nightly is built.

Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, etc.

pothos · 2025-09-03T13:00:02Z

@chewi My idea is to load the "normal" initrd as a loopback mount from /usr and switch to it for Ignition, network drivers and so on. If you want, we could try to plug your busybox experiments in on this branch. To keep things simple and avoid risking breakage, I would have the minimal initrd still ship the CPU microcode for the kernel to load, because loading it from userland involves more things to be aware of that aren't well supported, as far as I remember. Also, the /usr verity mount should be done by the minimal initrd, and we would bind mount it into the "normal" initrd mount to reuse it (or maybe dm-verity has no problem doing the work twice? For performance reasons we still might want to do it only once). In the end we could even drop the wrappers if we start including afterburn and ignition (and the other wrapped binaries) in the initrd again and remove them from /usr, where they aren't actually needed.

github-actions · 2025-09-03T13:00:16Z

Build action triggered: https://github.com/flatcar/scripts/actions/runs/18366905408

chewi

Although I shared the same concern about losing functionality that we would have to reimplement, I hadn't yet identified any such functionality, so I'm not quite ready to throw out my proposal to go straight from tiny initrd to real /usr. I'd really like to know what your specific concerns are.

This is an interesting approach in any case. My own alternative would have been to mount /usr as an overlay with the initrd, deleting all the duplicate files from the initrd, but I hadn't fully thought it through.

Regarding verity, I think it only needs to be set up once. I didn't enable verity in my own experiment, but /sysroot/usr was simply a bind mount of /usr. I think that would still work with verity applied.

...ntainer/src/third_party/coreos-overlay/sys-kernel/coreos-kernel/coreos-kernel-6.12.44.ebuild

pothos · 2025-09-03T15:53:37Z

> my proposal to go straight from tiny initrd to real /usr. I'd really like to know what your specific concerns are.

My intention was to keep most things untouched so that we can focus on the bare task of jumping into the regular initrd and avoid any risk of reimplementing all needed initrd logic. Things that should run from the initrd are: Ignition stages, hostname setup with afterburn (and basic network setup for them while they prepare the final network setup for the real system), setup of the /etc overlay, A/B sysext setup for OEM and extra sysexts (incl. fallback download), encrypted rootfs unlocking, generation of host-specific /etc files, disk uuid init, first-boot detection, and probably other stuff I don't remember. I think we should rely on the current code for almost all of it to avoid breaking things.

chewi · 2025-09-03T16:01:36Z

Okay, but I wasn't proposing rewriting all that. Dracut puts those scripts into an initrd. I was just going to put them in /usr instead. It's more or less the same thing. It's the scripts that Dracut itself provides through its own modules that I was concerned about.

pothos · 2025-09-04T03:06:57Z

The question is how these things are started, because they run in a context with dependencies. Having only one set of systemd units for both the initrd and the final system doesn't work if we want to make use of systemd in the initrd: it would run all enabled units under /usr (unless we inject the initrd check into most of them). Creating a separate environment manually without dracut would mean more work and risk compared to moving the current initrd as a whole. Once we get things working and have time left, we can still try this out as an optimization (preparing an /etc for the initrd stage with /etc/initrd-release, pulling in the ignition systemd units and so on, and masking or customizing all unnecessary units through drop-ins).
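
For illustration, the "initrd check" mentioned above boils down to testing for the marker file. A minimal sketch, assuming the `/etc/initrd-release` marker is the only signal used (the `in_initrd` helper and its `root` parameter are hypothetical names for this example):

```python
import os
import tempfile


def in_initrd(root="/"):
    """Return True if the tree under `root` carries the initrd marker file."""
    return os.path.exists(os.path.join(root, "etc", "initrd-release"))


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of the live system.
    with tempfile.TemporaryDirectory() as root:
        print(in_initrd(root))  # marker absent
        os.makedirs(os.path.join(root, "etc"))
        open(os.path.join(root, "etc", "initrd-release"), "w").close()
        print(in_initrd(root))  # marker present
```

With a single set of units, every unit meant for only one of the two stages would need an equivalent condition, which is the extra work described above.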

chewi

Looking good! I'm tentatively approving this, just a couple of things to consider.

You can drop the sudo calls. RESTRICT="userpriv" means we're already running as root because Dracut needs it. If we didn't have that, sudo wouldn't work anyway.

I'm now somewhat confused about the compression. The kernel documentation says that you're only supposed to pass a single cpio to CONFIG_INITRAMFS_SOURCE. It also says the early cpio must not be compressed. We have been telling Dracut not to compress, so what we've been providing has been totally uncompressed. Copilot says that the kernel build isn't smart enough to only compress the main part, but it also says that the uncompressed early cpio rule only applies when you're passing the initramfs separately at boot time, not when it's built in. I suppose that must be true, since it appears that we've been compressing the whole thing via the kernel build. What you've proposed doesn't change that, but I thought it would be a good opportunity to write this down and check that we're all on the same page.
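
One way to check what a standalone initramfs image starts with is to look at its first bytes. The sketch below is only that, under the assumption that the image is available as a separate file. It cannot inspect an initramfs that is built into the kernel image, and the `identify` helper is a hypothetical name.

```python
import gzip
import lzma

# Well-known magic numbers at the start of each format.
MAGICS = [
    (b"070701", "cpio"),           # newc cpio, uncompressed
    (b"070702", "cpio"),           # newc cpio with checksums, uncompressed
    (b"\x1f\x8b", "gzip"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"BZh", "bzip2"),
]


def identify(data):
    """Name the format that the given bytes start with, or 'unknown'."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    print(identify(b"070701" + b"0" * 104))       # uncompressed cpio header
    print(identify(gzip.compress(b"payload")))    # gzip stream
    print(identify(lzma.compress(b"payload")))    # xz stream
```

If the first bytes of the image are not a cpio magic, the early cpio at the front has been compressed along with the rest.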

pothos · 2025-10-08T12:21:05Z

Ok, dropped the sudo calls.

Yes, good question - I assume that the kernel build system knows whether the first cpio can be compressed or not. We could check with real hardware if we get the microcode update applied or not (with any Flatcar release as we didn't change this).

chewi · 2025-10-08T12:27:52Z

Yes, probably best to check that the microcode actually works, not merely whether we've changed anything. The microcode was actually missing entirely until I fixed that a few months back! See #2837.

pothos · 2025-10-09T03:08:27Z

Yes, I think I tested it but that was with the truncation - I don't remember the details (Edit: Tested now and it doesn't seem to work either). After the changes with the new lsinitrd extraction I didn't test it again and just see now that it doesn't seem to work.

pothos · 2025-10-09T03:22:29Z

Confusing, it doesn't work on Alpha either but on Stable I've seen it applied.

pothos · 2025-10-09T03:35:50Z

Microcode updating is also not working in Beta.

Sounds like the behavior changed in the PR you linked.
Stable doesn't have your changes and prints this at boot:

[    2.835111] microcode: Current revision: 0x00000100
[    2.840016] microcode: Updated early from: 0x000000f4

But Beta has your changes and it prints:

[    5.932758] microcode: Current revision: 0x000000f4
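
The comparison between the two boot logs can be automated with a small sketch. It assumes the kernel prints exactly the two message formats quoted above. The function name and the returned dictionary layout are made up for this example.

```python
import re

CURRENT = re.compile(r"microcode: Current revision: (0x[0-9a-fA-F]+)")
EARLY = re.compile(r"microcode: Updated early from: (0x[0-9a-fA-F]+)")


def microcode_status(log):
    """Extract the current revision and, if present, the pre-update revision."""
    current = CURRENT.search(log)
    early = EARLY.search(log)
    return {
        "current": int(current.group(1), 16) if current else None,
        "updated_early_from": int(early.group(1), 16) if early else None,
    }


# The log lines quoted in this comment.
STABLE_LOG = """\
[    2.835111] microcode: Current revision: 0x00000100
[    2.840016] microcode: Updated early from: 0x000000f4
"""
BETA_LOG = "[    5.932758] microcode: Current revision: 0x000000f4\n"

if __name__ == "__main__":
    for name, log in (("Stable", STABLE_LOG), ("Beta", BETA_LOG)):
        status = microcode_status(log)
        applied = status["updated_early_from"] is not None
        print(name, hex(status["current"]), "early update applied:", applied)
```

A missing "Updated early from" line together with an unchanged revision is the symptom seen on Beta.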

pothos · 2025-10-09T04:15:27Z

So I guess the current way of passing it in does indeed not work and we need to change this. But not in this PR.

pothos · 2025-10-09T05:53:23Z

I created a bug report for it: flatcar/Flatcar#1909

The growth of binaries over time and the inclusion of new features filled the available boot partition space, so that the kernel+initrd almost couldn't fit twice anymore as required for updates. We employed workarounds such as wrapper scripts for ignition, afterburn and other binaries so that they are loaded from /usr. However, this was still not enough and we would have to do the same for (network) kernel modules and firmware. To avoid making this ever more complex we can use a dedicated initrd focused on loading the full initrd from /usr and then this full initrd can use dracut as before and even drop all the workarounds we accumulated.

Introduce a busybox init script that prepares a minimal environment, has debug toggles and an emergency shell, and only loads the real initrd from /usr to switch over to it. Because mdev is not a proper udev replacement, some additional scripting is needed. Busybox's modprobe does not handle dependencies well, so we need the real kmod for that (which also guarantees that the same modprobe options are set). Some other busybox commands also lack conveniences such as loading a kernel module automatically, so this has to be done explicitly.

We still set up dm-verity for /usr so that we have the same security properties (the code comes from the bootengine systemd generators we have and also covers PXE boot with a squashfs /usr passed from an additional cpio). The real initrd then reuses the mount point for /usr and loads any kernel modules and firmware that weren't loaded already.

We also have to make the dependencies for parse-ip-for-networkd.service a bit more explicit because the removal of the /sysusr mount in the full initrd exposed a race condition.

## How to use

With flatcar/scripts#3241

## Testing done

See above

Co-authored-by: James Le Cuirot <[email protected]>
Signed-off-by: Kai Lueke <[email protected]>

pothos had a problem deploying to development September 3, 2025 12:53 — with GitHub Actions Error

chewi requested changes Sep 3, 2025

pothos had a problem deploying to development September 4, 2025 05:42 — with GitHub Actions Error

pothos force-pushed the kai/initrd-in-usr branch 14 times, most recently from 4306d75 to 0bfc20a Compare September 12, 2025 14:08

pothos mentioned this pull request Sep 12, 2025

Use a minimal initrd to switch to the full initrd stored in /usr flatcar/bootengine#110

Merged

pothos force-pushed the kai/initrd-in-usr branch 4 times, most recently from b250dfa to 647190c Compare September 15, 2025 16:25

pothos temporarily deployed to development September 15, 2025 16:25 — with GitHub Actions Inactive

pothos force-pushed the kai/initrd-in-usr branch from 647190c to 3561af4 Compare September 16, 2025 03:16

pothos temporarily deployed to development September 16, 2025 03:17 — with GitHub Actions Inactive

pothos had a problem deploying to development October 7, 2025 15:21 — with GitHub Actions Error

pothos force-pushed the kai/initrd-in-usr branch from 777af55 to 55490bb Compare October 7, 2025 15:23

pothos had a problem deploying to development October 7, 2025 15:24 — with GitHub Actions Error

pothos force-pushed the kai/initrd-in-usr branch from 55490bb to d1f0555 Compare October 7, 2025 15:28

pothos had a problem deploying to development October 7, 2025 15:28 — with GitHub Actions Error

pothos force-pushed the kai/initrd-in-usr branch from d1f0555 to 360aa17 Compare October 7, 2025 15:32

pothos temporarily deployed to development October 7, 2025 15:32 — with GitHub Actions Inactive

pothos requested a review from chewi October 7, 2025 16:01

pothos force-pushed the kai/initrd-in-usr branch from 360aa17 to 2323e9c Compare October 8, 2025 07:18

pothos had a problem deploying to development October 8, 2025 07:18 — with GitHub Actions Error

chewi approved these changes Oct 8, 2025

pothos force-pushed the kai/initrd-in-usr branch from 2323e9c to f7eaf89 Compare October 8, 2025 12:16

pothos had a problem deploying to development October 8, 2025 12:16 — with GitHub Actions Error

pothos force-pushed the kai/initrd-in-usr branch from f7eaf89 to 5f1944b Compare October 9, 2025 05:56

pothos had a problem deploying to development October 9, 2025 05:56 — with GitHub Actions Failure

pothos merged commit eb3aadd into main Oct 9, 2025
1 of 5 checks passed

pothos deleted the kai/initrd-in-usr branch October 9, 2025 05:57

dongsupark mentioned this pull request Oct 10, 2025

overlay afterburn: update to 5.10.0 #3352

Merged

2 tasks

github-actions bot mentioned this pull request Oct 22, 2025

Monthly contributions report 2025-09-22 - 2025-10-21 flatcar/Flatcar#1927

Open

chewi mentioned this pull request Oct 27, 2025

New minimal initrd is printing "Invalid ELF header magic: != ELF" errors flatcar/Flatcar#1934

Closed

ader1990 mentioned this pull request Nov 11, 2025

[RFE] Flatcar boot partition size shrink effort flatcar/Flatcar#1381

Closed
