fix: remove AzureLinux 3.0 modprobe LPE blacklist (CSE-time + VHD bake-in) — kernel 6.6.139.1-1.azl3+ fixes upstream#8546

Merged. djsly merged 10 commits into main from djsly/38070527-remove-azl3-lpe-mitigation on May 22, 2026.

Conversation

@djsly (Collaborator) commented May 21, 2026

Summary

Three kernel LPE vulnerabilities tracked in Azure/AKS#5753 are now fixed upstream in the AzureLinux 3.0 kernel as of 6.6.139.1-1.azl3. The corresponding modprobe blacklist mitigations are no longer required on AzureLinux 3.0 and have been fully descoped on that OS — both the CSE-time runtime apply and the VHD-build bake-in.

| Vulnerability | CVE(s) | Modules |
|---|---|---|
| Copy Fail | CVE-2026-31431 | algif_aead |
| DirtyFrag | CVE-2026-43284, CVE-2026-43500 | esp4, esp6, rxrpc |
| Fragnesia | CVE-2026-46300 | esp4, esp6 (covered by DirtyFrag) |

Only AzureLinux 3.0 (regular and Kata) is descoped, because kernel 6.6.139.1-1.azl3 fixes all three CVEs upstream AND customers reported the blacklist actively blocks legitimate workloads that need those modules. Ubuntu and AzureLinux OSGuard mitigations are unchanged (defense-in-depth retained — OSGuard workloads do not require the affected modules). AzureLinux 2.0 (Mariner) keeps the CSE-time runtime apply because those VHD images are frozen at 202512.06.0 (see FrozenCBLMarinerV2AndAzureLinuxV2SIGImageVersion) and cannot pick up new modprobe-CIS.conf entries via VHD refresh — the runtime apply is the only path to deliver the mitigation to AzL2 nodes.

Decision rationale

The initial iteration of this PR kept the install <mod> /bin/false + blacklist <mod> entries for the four modules baked into AzureLinux 3.0 VHDs as defense-in-depth, removing only the CSE-time runtime apply. Customer feedback during review made it clear that legitimate AzL3 workloads require some of those kernel modules, notably the esp* modules for IPsec/XFRM use cases and rxrpc for AFS-style RPC (see cilium/cilium#46023). With the upstream kernel fix shipping, keeping the blacklist baked in is no longer defense-in-depth; it is an active regression. This PR therefore removes the bake-in on AzureLinux 3.0 too.
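
For readers unfamiliar with the two directives mentioned above, the following is a minimal sketch of what the entries look like for the four modules. `render_blacklist_entries` is a hypothetical helper written for illustration; it is not a function from this repository, and the real modprobe-CIS.conf may contain other entries as well.

```bash
#!/usr/bin/env bash
# Sketch only: prints the two modprobe.d directives per module.
# render_blacklist_entries is a hypothetical name, not part of the PR.
render_blacklist_entries() {
    local mod
    for mod in "$@"; do
        # "install <mod> /bin/false" makes modprobe run /bin/false instead of
        # loading the module, so an explicit `modprobe <mod>` fails.
        printf 'install %s /bin/false\n' "$mod"
        # "blacklist <mod>" makes modprobe ignore the module's internal aliases,
        # which stops alias-based automatic loading.
        printf 'blacklist %s\n' "$mod"
    done
}

render_blacklist_entries algif_aead esp4 esp6 rxrpc
```

Both directives are needed for a full block: `blacklist` alone does not prevent an explicit `modprobe`, which is why a workload that loads esp4 or rxrpc on demand fails only when the `install` line is present.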

Scope

| OS | Before this PR | After this PR | Reason |
|---|---|---|---|
| Ubuntu 22.04 / 24.04 | runtime apply + bake-in | unchanged (apply + bake-in) | Upstream kernel fix not yet shipped |
| AzureLinux 2.0 / Mariner (OS_VERSION=2.0) | runtime apply + bake-in | unchanged (apply + bake-in) | AzL2 SIG images frozen at 202512.06.0; a VHD refresh cannot deliver a new bake-in, so the runtime apply is retained as the only mitigation path |
| AzureLinux 3.0 regular + Kata (azurelinux, azurelinux-kata) | runtime apply + bake-in | runtime apply skipped + bake-in removed | Kernel 6.6.139.1-1.azl3+ fixes all three CVEs; blacklist blocks legitimate workloads |
| AzureLinux OSGuard (AzL3-OSGuard) | runtime apply + bake-in | unchanged (apply + bake-in) | Hardened secure-boot variant; defense-in-depth retained; OSGuard workloads do not require the affected modules |
| Flatcar / ACL | never in scope | unchanged | Never applied |
| Windows | N/A | N/A | Never affected |

What this PR does NOT do

Customers running existing in-support AzL3 VHDs will continue to have the blacklist baked in until they upgrade to a newer VHD; no CSE-time active removal is implemented in this PR.

The four-module blacklist will simply not be present on newly-built AzL3 VHDs going forward. Existing in-support AzL3 VHDs (built before this change merges) keep the baked-in /etc/modprobe.d/CIS.conf blacklist entries. We did not add a CSE-time rm or rewrite-on-boot path that would actively scrub pre-existing blacklist files on already-deployed nodes — that was considered and explicitly rejected to avoid mutating in-place security configuration on already-provisioned fleet. Affected customers will pick up the unblocked configuration when they roll their node pools to a newer AzL3 VHD.

Backward-compat analysis (6-month VHD window)

  • New CSE on old AzL3 VHD (still has bake-in): CSE-time gate skips the runtime apply on AzL3 regular/Kata. The pre-existing /etc/modprobe.d/CIS.conf is left in place. Net effect: the four modules remain blocked on old AzL3 VHDs until the customer rolls. Kernel-fix-only mitigation kicks in once they upgrade.
  • Old CSE on new AzL3 VHD (no bake-in): the old CSE still calls disableVulnerableKernelModule for the four modules and writes the /etc/modprobe.d/disable-<mod>.conf drop-ins itself, so the modules remain blocked. Only once both the new CSE and a new VHD are in the field for an AzL3 customer do workloads requiring those modules see the unblocked configuration.
  • Ubuntu / OSGuard / AzL2-Mariner: unchanged — runtime apply still runs, bake-in still present.
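
To see which of the combinations above a given node is in, an operator could inspect the modprobe.d directory directly. The sketch below is one way to do that; `module_block_state` and the `MODPROBE_DIR` override are hypothetical names introduced here, and the check only looks at directive text, not at whether the kernel would actually refuse the load.

```bash
#!/usr/bin/env bash
# module_block_state DIR MOD
# Prints "blocked" if any *.conf under DIR carries an install-to-/bin/false or
# blacklist directive for MOD, otherwise "unblocked". Hypothetical helper.
module_block_state() {
    local dir="$1" mod="$2"
    # -s silences "no such file" errors when the glob matches nothing.
    if grep -Eqs "^[[:space:]]*(install[[:space:]]+${mod}[[:space:]]+/bin/false|blacklist[[:space:]]+${mod})[[:space:]]*$" "$dir"/*.conf; then
        echo "blocked"
    else
        echo "unblocked"
    fi
}

# MODPROBE_DIR is an assumed override for testing; defaults to the real path.
for mod in algif_aead esp4 esp6 rxrpc; do
    printf '%s: %s\n' "$mod" "$(module_block_state "${MODPROBE_DIR:-/etc/modprobe.d}" "$mod")"
done
```

Because it scans every `*.conf` file, the check picks up both a baked-in file and per-module drop-ins, so it reports "blocked" for either half of the old CSE / old VHD pairing.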

Final CSE-time OS gate

if isUbuntu "$OS" \
   || isAzureLinuxOSGuard "$OS" "$OS_VARIANT" \
   || { isMarinerOrAzureLinux "$OS" && [ "${OS_VERSION}" = "2.0" ]; }; then
    disableVulnerableKernelModule "algif_aead" ...
    # esp4, esp6, rxrpc
fi

Applies on: Ubuntu (all), AzureLinux OSGuard, AzureLinux 2.0 / Mariner. Skips on: AzureLinux 3.0 regular/Kata, ACL, Flatcar.
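
The gate can be exercised in isolation to confirm the APPLY/SKIP matrix. This is a sketch under stated assumptions: the three helper functions below are hypothetical stand-ins (the real helpers and the string values they compare against are not shown in this PR description), and `gate_decision` is a name invented here. Only the boolean structure of the `if` mirrors the snippet above.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the real OS helpers; the compared string values
# are assumptions for illustration only.
isUbuntu() { [ "$1" = "UBUNTU" ]; }
isMarinerOrAzureLinux() { [ "$1" = "MARINER" ] || [ "$1" = "AZURELINUX" ]; }
isAzureLinuxOSGuard() { [ "$1" = "AZURELINUX" ] && [ "$2" = "OSGUARD" ]; }

# gate_decision OS OS_VARIANT OS_VERSION -> prints APPLY or SKIP
gate_decision() {
    local OS="$1" OS_VARIANT="$2" OS_VERSION="$3"
    if isUbuntu "$OS" \
       || isAzureLinuxOSGuard "$OS" "$OS_VARIANT" \
       || { isMarinerOrAzureLinux "$OS" && [ "${OS_VERSION}" = "2.0" ]; }; then
        echo "APPLY"
    else
        echo "SKIP"
    fi
}

gate_decision UBUNTU "" 24.04
gate_decision AZURELINUX OSGUARD 3.0
gate_decision MARINER "" 2.0
gate_decision AZURELINUX "" 3.0
gate_decision AZURELINUX KATA 3.0
gate_decision FLATCAR "" ""
```

With these stubs the first three calls print APPLY and the last three print SKIP. The braces around the third clause matter: they group the version check with isMarinerOrAzureLinux so that the `OS_VERSION` comparison cannot affect the Ubuntu or OSGuard branches.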

Files changed

| File | Change |
|---|---|
| parts/linux/cloud-init/artifacts/cse_main.sh | CSE-time OS gate is now `isUbuntu \|\| isAzureLinuxOSGuard \|\| (isMarinerOrAzureLinux && OS_VERSION=2.0)`. Drops AzL3 regular/Kata (kernel fixed); keeps Ubuntu, OSGuard, and AzL2/Mariner (the latter because its VHD image is frozen and can't deliver new bake-ins). Comment block rewritten with full rationale. |
| vhdbuilder/packer/packer_source.sh | `cpAndMode $MODPROBE_CIS_SRC …` is now wrapped in `if isAzureLinux "$OS" "$OS_VARIANT" && [ "${OS_VERSION}" = "3.0" ] && ! isAzureLinuxOSGuard …` (skip), else copy. OSGuard explicitly retains the bake-in. Ubuntu, Mariner, ACL, Flatcar paths unchanged byte-for-byte. |
| vhdbuilder/packer/test/linux-vhd-content-test.sh | testVulnerableKernelModulesDisabled takes `$OS_SKU $OS_VERSION` and asserts ABSENCE of both `install <mod> /bin/false` and `blacklist <mod>` directives on AzL3 (os_sku=AzureLinux && os_version=3.0), presence + load-refusal otherwise. OSGuard's OS_SKU is AzureLinuxOSGuard, which is distinct from AzureLinux, so it falls through to the full presence check correctly. |
| e2e/validators.go | ValidateVulnerableKernelModulesDisabled is now OS-conditional: AzL3 regular (NOT OSGuard, via `!s.VHD.Distro.IsAzureLinuxOSGuardDistro()`) asserts absence of both install and blacklist directives; Ubuntu and OSGuard fall through to the full presence + load-refusal check. |
| spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh | Shebang restored to `#!/usr/bin/env shellspec`. New OS-gate test suite covering Ubuntu APPLY, OSGuard APPLY, Mariner-2.0 APPLY, AzL3 regular/Kata SKIP, ACL SKIP, Flatcar SKIP. Existing unit tests for disableVulnerableKernelModule() are unchanged. |
| parts/linux/cloud-init/artifacts/modprobe-CIS.conf | Unchanged: file still ships in the repo and is still baked into Ubuntu / Mariner / ACL / Flatcar / AzureLinuxOSGuard VHDs. |
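
The build-time half of the change (the packer_source.sh row) follows the same shape as the CSE gate, inverted: copy everywhere except AzL3 regular/Kata. The sketch below illustrates that shape only. The helper stubs, the `bake_modprobe_cis` wrapper, the arguments passed to isAzureLinuxOSGuard, the use of `install(1)` in place of the script's own cpAndMode, and the 0644 mode are all assumptions made for this example.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins; the real helpers live elsewhere and may differ.
isAzureLinux() { [ "$1" = "AZURELINUX" ]; }
isAzureLinuxOSGuard() { [ "$1" = "AZURELINUX" ] && [ "$2" = "OSGUARD" ]; }

# bake_modprobe_cis SRC DEST
# Copies SRC to DEST unless the image is AzureLinux 3.0 regular/Kata.
# Reads OS, OS_VARIANT and OS_VERSION from the environment.
bake_modprobe_cis() {
    local src="$1" dest="$2"
    if isAzureLinux "$OS" "$OS_VARIANT" && [ "${OS_VERSION}" = "3.0" ] \
       && ! isAzureLinuxOSGuard "$OS" "$OS_VARIANT"; then
        echo "skip: not baking $(basename "$dest") on AzureLinux 3.0"
    else
        # Stand-in for cpAndMode; the 0644 mode is an assumption.
        install -m 0644 "$src" "$dest"
        echo "baked: $dest"
    fi
}
```

Calling it with `OS=AZURELINUX OS_VARIANT=OSGUARD OS_VERSION=3.0` takes the copy branch, which matches the statement that OSGuard explicitly retains the bake-in.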

Test plan

  • shellspec --shell bash spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh — all cases pass.
  • cd e2e && go build ./... — clean.
  • bash -n syntax-check on packer_source.sh, linux-vhd-content-test.sh, cse_main.sh.
  • AzL3 VHD content test on freshly-built VHD: testVulnerableKernelModulesDisabled AzureLinux 3.0 must PASS with all four modules absent (both install and blacklist directives).
  • AzL2 / Mariner VHD content test: existing presence + load-refusal assertions must continue to PASS (runtime apply kept).
  • OSGuard VHD content test: existing presence + load-refusal assertions must continue to PASS.
  • Ubuntu 22.04 / 24.04 VHD content test on freshly-built VHD: existing presence + load-refusal assertions must continue to PASS.
  • E2E on AzL3 regular: validator must report all four entries ABSENT.
  • E2E on AzL3 OSGuard: validator must report presence + not-loaded + load-refused.
  • E2E on Ubuntu 22.04 / 24.04: validator must report presence + not-loaded + load-refused (unchanged).
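
The absence-versus-presence split that the content-test items above rely on can be sketched as a single function. This is a simplified, hypothetical stand-in for testVulnerableKernelModulesDisabled: `check_vulnerable_modules` is an invented name, it inspects one file passed as an argument, and it deliberately omits the load-refusal (modprobe) step, which needs a real kernel.

```bash
#!/usr/bin/env bash
# check_vulnerable_modules OS_SKU OS_VERSION CONF_FILE
# Returns 0 if CONF_FILE matches expectations for the given OS, 1 otherwise.
check_vulnerable_modules() {
    local os_sku="$1" os_version="$2" conf="$3" mod rc=0
    for mod in algif_aead esp4 esp6 rxrpc; do
        local has_install=no has_blacklist=no
        grep -Eqs "^install[[:space:]]+${mod}[[:space:]]+/bin/false" "$conf" && has_install=yes
        grep -Eqs "^blacklist[[:space:]]+${mod}([[:space:]]|\$)" "$conf" && has_blacklist=yes
        if [ "$os_sku" = "AzureLinux" ] && [ "$os_version" = "3.0" ]; then
            # AzL3 regular: both directives must be absent.
            if [ "$has_install" = yes ] || [ "$has_blacklist" = yes ]; then
                echo "FAIL: ${mod} unexpectedly blocked"
                rc=1
            fi
        elif [ "$has_install" = no ] || [ "$has_blacklist" = no ]; then
            # Every other SKU (including AzureLinuxOSGuard): both must be present.
            echo "FAIL: ${mod} directive missing"
            rc=1
        fi
    done
    return "$rc"
}
```

Matching on the exact string `AzureLinux` is what lets an `AzureLinuxOSGuard` SKU fall through to the presence branch without a separate case.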

Related

🤖 Generated with GitHub Copilot CLI

Kernel 6.6.139.1-1.azl3 and later fix Copy Fail (CVE-2026-31431),
DirtyFrag (CVE-2026-43284, CVE-2026-43500), and Fragnesia (CVE-2026-46300)
upstream, so the runtime modprobe blacklist for algif_aead/esp4/esp6/rxrpc
is no longer required on AzureLinux 3.0.

Defense-in-depth: the static modprobe-CIS.conf baked into every VHD is
left untouched, so all VHDs in the 6-month support window still drop
the install/blacklist directives at build time regardless of kernel
version.

Ubuntu 22.04/24.04 and AzureLinux 2.0 (Mariner) keep the runtime apply:
their upstream kernel does not yet ship the fix. Windows was never
affected.

Updates:
  * parts/linux/cloud-init/artifacts/cse_main.sh - gate is now
    isUbuntu || isMariner (was isUbuntu || isMarinerOrAzureLinux).
  * spec/.../cse_main_disable_modules_spec.sh - new tests asserting
    APPLY on Ubuntu/Mariner and SKIP on AzureLinux 3.0 / Kata / ACL /
    Flatcar.
  * e2e/validators.go - ValidateVulnerableKernelModulesDisabled is
    OS-conditional: full presence + load-refusal check on Ubuntu/Mariner,
    defense-in-depth modprobe.d entry presence-only check on AzureLinux.

Refs: Azure/AKS#5753

AB#38070527

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh Outdated

Copilot AI left a comment

Pull request overview

This PR adjusts the Linux CSE-time CVE kernel-module mitigation behavior to skip the runtime modprobe blacklist on AzureLinux 3.0 (relying on the upstream kernel fix in 6.6.139.1-1.azl3+), while keeping Ubuntu and Mariner behavior unchanged and retaining the baked-in modprobe-CIS.conf defense-in-depth approach.

Changes:

  • Gate CSE-time disableVulnerableKernelModule calls to Ubuntu + Mariner only, excluding AzureLinux 3.0.
  • Add ShellSpec coverage to assert the OS gating behavior (APPLY vs SKIP) across key OS variants.
  • Make the e2e vulnerable-module validator OS-conditional, doing a presence-only modprobe.d check on AzureLinux 3.0 and the full “present + not loaded + modprobe refused” checks elsewhere.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| parts/linux/cloud-init/artifacts/cse_main.sh | Updates the OS gate so AzureLinux 3.0 skips the runtime module-disable calls while Ubuntu/Mariner still apply them. |
| spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh | Adds unit tests validating the new OS gate behavior. |
| e2e/validators.go | Makes the vulnerable-module validation logic conditional for AzureLinux 3.0 vs other distros. |

Comment thread e2e/validators.go Outdated
Customers reported that the algif_aead / esp4 / esp6 / rxrpc modprobe
blacklist baked into AzureLinux 3.0 VHDs blocks legitimate workloads.
Now that kernel 6.6.139.1-1.azl3+ fixes Copy Fail / DirtyFrag / Fragnesia
upstream, the bake-in is no longer needed on AzL3.

Changes:
- packer_source.sh: skip cpAndMode of MODPROBE_CIS on AzureLinux 3.0
  (Ubuntu and Mariner bake-in unchanged — those kernels still vulnerable).
- linux-vhd-content-test.sh: testVulnerableKernelModulesDisabled now
  asserts the four entries are ABSENT on AzL3 and present + load-refused
  on Ubuntu/Mariner.
- e2e/validators.go: ValidateVulnerableKernelModulesDisabled now asserts
  absence on AzureLinux (matching newly-built VHDs); Ubuntu/Mariner full
  presence+refusal check unchanged.
- cse_main.sh: updated AzL3 skip comment to reflect that the static
  blacklist file is no longer baked in either; existing in-support AzL3
  VHDs continue to carry the bake-in until they roll (no CSE-time active
  removal — by design).

No CSE-time active removal of pre-existing blacklist files is implemented;
customers on existing in-support AzL3 VHDs will get the unblocked
configuration on their next AzL3 VHD upgrade.

AB#38070527

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@djsly djsly changed the title fix: skip CSE-time CVE modprobe blacklist on AzureLinux 3.0 (kernel 6.6.139.1-1.azl3 has upstream fix) fix: remove AzureLinux 3.0 modprobe LPE blacklist (CSE-time + VHD bake-in) — kernel 6.6.139.1-1.azl3+ fixes upstream May 21, 2026
Addresses review feedback:
- cse_main.sh: drop unused isMariner branch from the modprobe
  blacklist gate (AKS does not build Mariner VHDs anymore).
- cse_main_disable_modules_spec.sh: update spec cases to match the
  new gate — Ubuntu APPLY; AzL3/Mariner/Kata/ACL/Flatcar SKIP.
- validators.go: refresh the top-level doc comment on
  ValidateVulnerableKernelModulesDisabled to describe the
  OS-conditional behavior accurately (Ubuntu: full presence +
  load-refusal; AzureLinux: ABSENCE of blacklist entries).

AB#38070527

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 21, 2026 20:42

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment thread parts/linux/cloud-init/artifacts/cse_main.sh
Comment thread vhdbuilder/packer/test/linux-vhd-content-test.sh Outdated
Comment thread e2e/validators.go Outdated
Per Sylvain's follow-up review on commit e1ae35c: the isMariner branch
restored in that commit is dead code — verified that AKS stopped
building Mariner (AzL2) VHDs on 2025-12-06 and the active build
pipeline (.pipelines/.vsts-vhd-builder-release.yaml) only references
buildAzureLinuxV3*, buildAzureLinuxOSGuardV3*, and buildflatcar*
parameters (no buildMariner*). The mitigation is also already baked
into modprobe-CIS.conf on every in-support Mariner VHD, so the runtime
apply was purely defense-in-depth duplicating the bake-in.

Gate is now: isUbuntu || isAzureLinuxOSGuard.

This unconditionally drops the mitigation runtime-apply on Mariner
nodes that might scale up via CRP-served CSE during the remaining
~16 days of Mariner VHD support (last build's 6-month window expires
~2026-06). That is acceptable because:
  1. The static bake-in in /etc/modprobe.d/modprobe-CIS.conf on the
     VHD itself remains in place on all in-support Mariner VHDs.
  2. Mariner support fully sunsets in ~2 weeks.

Updates:
  * cse_main.sh: gate simplified; comment rewritten with full Mariner
    rationale.
  * cse_main_disable_modules_spec.sh: Mariner / Mariner-Kata cases
    flipped from APPLY to SKIP. 13/13 still pass.

AB#38070527

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address Copilot reviewer feedback on commit 6b470f9:

* e2e/validators.go: the 'append the module name to the list below'
  comment was added before this PR introduced two separate module
  lists (AzL3-absence branch + default presence/load-refusal branch).
  Clarify that BOTH lists must be updated when adding a new CVE.

* linux-vhd-content-test.sh: same issue — testVulnerableKernelModulesDisabled
  now has two loops (AzL3-absence + default). Update the comment to
  say BOTH must be appended.

No functional changes — comment-only.

AB#38070527

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 21, 2026 21:37

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Comment thread parts/linux/cloud-init/artifacts/cse_main.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 21, 2026 21:48

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Comment thread spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/cse_main_disable_modules_spec.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 22, 2026 00:56

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread vhdbuilder/packer/test/linux-vhd-content-test.sh
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh
…ment

Address Copilot reviewer feedback on commit e12170c: the lead-in
comment said the mitigation is applied on "Ubuntu and AzureLinux
OSGuard", but the gate also applies on AzL2/Mariner (covered by the
detailed paragraph below). Update the summary line so the first line
of the comment block accurately reflects all three apply targets.

No functional change.

AB#38070527

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@djsly djsly merged commit b73f2d6 into main May 22, 2026
41 of 44 checks passed
@djsly djsly deleted the djsly/38070527-remove-azl3-lpe-mitigation branch May 22, 2026 19:07
5 participants