
[build-hooks] Fix reproducible build: pin apt versions and track autoremove #25552

Merged
lihuay merged 4 commits into sonic-net:master from rustiqly:fix/reproducible-build-apt-version
Mar 13, 2026

Conversation

@rustiqly
Contributor

Problem

When ENABLE_VERSION_CONTROL_DEB=y (SONIC_VERSION_CONTROL_COMPONENTS includes deb), the apt-get hook has two issues:

1. Versions from versions-deb are never applied to apt-get arguments (#7502)

check_apt_version() only warns about missing package versions — it never modifies the apt-get install command to pin versions. While the APT preferences mechanism (01-versions-deb/etc/apt/preferences.d/) provides pinning via Pin-Priority: 999, this doesn't guarantee exact version matching (APT can still pick a different version if the pinned one isn't available).

2. apt-get autoremove doesn't capture package versions

The hook captures package versions before purge and remove, but not autoremove. Build dependencies installed via apt-get install and later cleaned up via apt-get autoremove are lost from the version tracking, making builds non-reproducible.

Fix

pin_apt_versions() — new function in buildinfo_base.sh

When ENABLE_VERSION_CONTROL_DEB=y, rewrites apt-get install foo bar to apt-get install foo=1.2.3 bar=4.5.6 by looking up versions in the versions-deb file.

  • Only activates when deb version control is enabled (off by default)
  • Packages already pinned (foo=1.2.3) pass through unchanged
  • Packages not in versions-deb pass through with existing warning
  • Works alongside the existing APT preferences mechanism as defense-in-depth
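
The rewriting described above can be sketched roughly as follows. This is a minimal illustration and not the PR's code: the function name pin_install_args and its calling convention are hypothetical, and it assumes the versions file holds one name==version entry per line.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: rewrite "install foo bar" into "install foo=1.2.3 bar=4.5.6".
# Usage: pin_install_args <versions-file> <apt-get args...>
# Prints the rewritten arguments, one per line.
pin_install_args() {
    local versions_file=$1; shift
    local seen_install=n arg version
    local -a out=()
    for arg in "$@"; do
        if [[ $seen_install == n ]]; then
            # Everything up to and including "install" is copied verbatim.
            if [[ $arg == install ]]; then seen_install=y; fi
            out+=("$arg")
        elif [[ $arg == -* || $arg == *=* ]]; then
            # Options and already-pinned packages pass through unchanged.
            out+=("$arg")
        else
            # Compare the name field as a literal string against the argument.
            version=$(awk -F'==' -v pkg="$arg" '$1 == pkg {print $2; exit}' "$versions_file")
            if [[ -n $version ]]; then
                out+=("$arg=$version")
            else
                out+=("$arg")   # not in the versions file: leave unpinned
            fi
        fi
    done
    printf '%s\n' "${out[@]}"
}
```

A caller would read the printed lines back into an array before invoking the real apt-get; that wiring is left out here.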

autoremove tracking

Added autoremove to the list of commands that trigger version capture (dpkg-query -W output recorded in purge-versions-deb), alongside purge and remove.
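
As a rough sketch of that check, assuming the hook inspects its arguments for a removal subcommand before calling the real apt-get: the function names below are hypothetical, and the name==version output format is an assumption based on the versions-deb format mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide whether an apt-get invocation removes packages,
# so installed versions can be recorded before they disappear.
needs_version_capture() {
    local arg
    for arg in "$@"; do
        case $arg in
            purge|remove|autoremove) return 0 ;;
        esac
    done
    return 1
}

# Append the currently installed packages to the given file.
# The "name==version" line format is an assumption, not taken from the hook.
capture_versions() {
    dpkg-query -W -f='${Package}==${Version}\n' >> "$1"
}
```

In a wrapper this would be used as: needs_version_capture "$@" && capture_versions "$target_file", where target_file is whatever path the hook writes to.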

Impact

  • No change for default builds: deb is not in the default SONIC_VERSION_CONTROL_COMPONENTS
  • When version control IS enabled, packages are now actually pinned to the versions in versions-deb
  • Intermediate build dependencies removed via autoremove are now tracked

Fixes #7502

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the sonic-build-hooks apt-get hook path to improve reproducible builds when ENABLE_VERSION_CONTROL_DEB=y, by pinning apt-get install package arguments to the versions recorded in versions-deb and by capturing versions prior to apt-get autoremove.

Changes:

  • Add pin_apt_versions() helper to rewrite apt-get install args into pkg=version using versions-deb.
  • Update the apt-get hook to apply pinned args before invoking the real apt-get.
  • Extend version capture to include apt-get autoremove in addition to purge/remove.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
src/sonic-build-hooks/scripts/buildinfo_base.sh | Adds pin_apt_versions() to pin apt-get install package arguments based on versions-deb.
src/sonic-build-hooks/hooks/apt-get | Applies pinned args during installs and captures versions on autoremove.

Contributor

@yxieca yxieca left a comment

autoremove tracking: Clean, LGTM.

pin_apt_versions(): A few concerns:

  1. Grep injection with + in package names: grep "^${para}==" uses regex, but Debian package names commonly contain + (e.g., g++-10, libstdc++6, libc++1). The + is a regex quantifier, so grep "^g++-10==" matches g-10==, gg-10==, etc. instead of the literal package name. This could silently pin the wrong version or fail to match. Fix: use grep -F with an exact match pattern, e.g., grep -F "${para}==" "$VERSION_FILE" | head -1 | awk -F'==' '{print $2}'. Since grep -F treats the pattern as a literal string, + is matched correctly.

  2. set -- $versioned_args loses quoting — The subshell output is re-split on whitespace via unquoted $versioned_args. Package names don't have spaces, but option values could. Minor risk but worth noting.

  3. Options after install keyword — Args like -t bullseye will have -t skipped but bullseye treated as a package name and looked up in versions-deb. It won't match so it passes through safely, but it's a design gap worth a comment in the code.

Item #1 is the actionable one — + in package names is common enough to cause real issues.

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from 8604bc6 to d57b140 Compare February 20, 2026 01:42
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@rustiqly
Contributor Author

Thanks for the thorough review @yxieca! All addressed in the latest push:

  1. Grep injection with + in package names — Replaced grep "^${para}==" with awk -F'==' -v pkg="$para" '$1==pkg {print $2; exit}'. This does exact string matching on the package name field, so g++, libstdc++6, etc. are handled correctly.

  2. set -- $versioned_args loses quoting — Now uses read -r -a to split into a proper bash array, then set -- "${versioned_args[@]}" to preserve argument boundaries.

  3. Options after install keyword — Added a comment explaining the behavior (option values like bullseye pass through safely since they won't match in versions-deb).

Also fixed Copilot's catch: VERSION_FILE is now local to avoid mutating the global.
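
For readers following along, the two techniques named in items 1 and 2 can be sketched as below. The awk command is the one quoted in item 1; the wrapper functions lookup_version and count_args are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the awk lookup quoted above.
# awk compares the first field with pkg as a literal string, so characters
# such as "+" or "." in a package name carry no pattern meaning.
# Usage: lookup_version <package> <versions-file>
lookup_version() {
    awk -F'==' -v pkg="$1" '$1 == pkg {print $2; exit}' "$2"
}

# Split a whitespace-separated line into an array with read -r -a, then
# restore it as positional parameters with quoted expansion.
# Unlike an unquoted $var, read does not glob-expand words such as "*".
count_args() {
    local line=$1
    local -a versioned_args
    read -r -a versioned_args <<< "$line"
    set -- "${versioned_args[@]}"
    echo "$#"
}
```

With a versions file containing both g++==4:10.2.1-1 and g++-10==10.2.1-6, looking up g++ returns only the first entry, because the whole name field has to be equal.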

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

yxieca previously approved these changes Feb 20, 2026
Contributor

@yxieca yxieca left a comment

All feedback addressed — awk for exact match, proper array quoting, options documented. LGTM.

@xumia @liushilongbuaa — could you review the build-hooks changes? The pin_apt_versions() function modifies the apt-get install args to enforce version pinning from versions-deb when ENABLE_VERSION_CONTROL_DEB=y.

@liushilongbuaa
Contributor

@yijingyan2 , please review and test.

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from d57b140 to 487acfa Compare February 23, 2026 16:59
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@rustiqly
Contributor Author

Addressed all Copilot review comments in latest incremental commit:

  1. VERSION_FILE not local — Now local in both pin_apt_versions() and check_apt_version()
  2. grep regex injection — Replaced with awk -F'==' -v pkg=... exact string match in both functions
  3. check_apt_version uses = vs == — Fixed to use awk with == delimiter matching versions-deb format
  4. set -- $versioned_args injection — Replaced with read -r -a array + quoted expansion

@yijingyan2
Contributor

Hi @rustiqly, I ran a test build with the build option SONIC_VERSION_CONTROL_COMPONENTS=py2,py3,web,git,docker,deb. With your changes, package versions are correctly pinned to those specified in the versions-deb file, but this introduces a new issue.
The versions in the versions-deb files for docker/host images may contain +fips, e.g. openssh-server==1:10.0p1-7+fips (because the images were built with the build option ENABLE_FIPS=y). So when building these docker/host images, the version of openssh-server is pinned to 1:10.0p1-7+fips, which causes apt-get install to fail with the error message:

E: Version '1:10.0p1-7+fips' for 'openssh-server' was not found

Currently, we use the snapshot to pin package versions. Could pinning the versions based on both the versions-deb files and the snapshot cause conflicts?
@liushilongbuaa, please add or correct anything as needed. Thanks

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@rustiqly
Contributor Author

Good catch @yijingyan2! You're right — when versions-deb is generated from a FIPS build, packages like openssh-server get pinned to versions with a +fips suffix that don't exist in non-FIPS repos.

I just pushed a fix: pin_apt_versions() now strips the +fips suffix from the version string when ENABLE_FIPS is not y. This way:

  • FIPS builds: version used as-is (e.g. 1:10.0p1-7+fips)
  • Non-FIPS builds: suffix stripped (e.g. 1:10.0p1-7)
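
A minimal sketch of the conditional stripping described in this comment, assuming ENABLE_FIPS is visible to the hook as an environment variable. The function name strip_fips_suffix is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: drop a trailing "+fips" unless ENABLE_FIPS=y.
strip_fips_suffix() {
    local version=$1
    if [[ ${ENABLE_FIPS:-n} != y ]]; then
        # ${var%pattern} removes the shortest matching suffix, if present.
        version=${version%+fips}
    fi
    printf '%s\n' "$version"
}
```

Versions without the suffix are returned unchanged in either mode.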

Regarding snapshot vs versions-deb conflict — they serve different purposes: snapshot pins the repo state (which packages are available), while versions-deb pins the installed versions. They should be complementary, not conflicting. But if you see issues there, happy to discuss further.

Could you re-test with this latest commit?

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yijingyan2
Contributor

Hi @rustiqly, with ENABLE_FIPS=y, apt-get install openssh-server=1:10.0p1-7+fips would also fail: Debian publishes no official +fips versions, because the +fips packages are rebuilt locally.
There is another issue, related to architecture differences. Building an amd64 image such as mellanox failed with the error message:

E: Version '3.11-4+b1' for 'grep' was not found
E: Version '4.9-2+b1' for 'sed' was not found

The versions for grep and sed are pinned to versions that are only available on arm64.

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from 0c1889b to b5ec9ce Compare February 24, 2026 04:47
@rustiqly
Contributor Author

Great catches @yijingyan2, both issues are real:

  1. +fips suffix: You're right — even with ENABLE_FIPS=y, the +fips packages are locally rebuilt and won't exist in Debian repos. Fixed: now stripping +fips unconditionally.

  2. Cross-arch versions: The +b1/+b2 binary NMU revisions are arch-specific, so versions pinned on arm64 can fail on amd64. Fixed: pin_apt_versions() now does an apt-cache show check before pinning — if the version isn't available for the current arch, it falls back to unpinned.
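
The fallback in item 2 can be sketched along these lines. Note the substitution: the comment mentions an apt-cache show check, whereas this illustration uses apt-cache madison, which lists the candidate versions known to the configured sources for the current architecture. The function names and the APT_CACHE override (there only so the sketch can be exercised without a real package index) are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: pin only when the version is actually available.
# "apt-cache madison <pkg>" prints lines of the form "pkg | version | source".
version_available() {
    local pkg=$1 version=$2
    "${APT_CACHE:-apt-cache}" madison "$pkg" 2>/dev/null |
        awk -F'|' -v ver="$version" '
            {
                gsub(/^[ \t]+|[ \t]+$/, "", $2)       # trim the version column
                if (($2 "") == ver) found = 1         # force string comparison
            }
            END { exit !found }'
}

# Print "pkg=version" when available, otherwise the bare package name.
pin_if_available() {
    if version_available "$1" "$2"; then
        printf '%s=%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}
```

The trade-off of such a fallback is that an unavailable pin is skipped silently, so the build continues but is no longer tied to the recorded version.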

Pushed both fixes. Could you re-test?

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from b8108d7 to 15f66f2 Compare March 5, 2026 15:01
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from 15f66f2 to 25ca39e Compare March 6, 2026 15:00
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@rustiqly
Contributor Author

rustiqly commented Mar 8, 2026

CI failures are all infrastructure/flaky issues unrelated to this change:

  • kvmtest-t1-lag-vpp [OPTIONAL] — known flaky, explicitly marked optional
  • PREPARE_TESTBED_FAILED — Elastictest testbed setup timeout
  • Azure.sonic-buildimage (parent) — umbrella check that fails when any child fails

Could a maintainer please re-trigger CI? Thank you!

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from 25ca39e to c615ecd Compare March 9, 2026 14:01
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

@yejianquan yejianquan left a comment

LGTM. Three solid fixes:

  1. grep pattern fixed from ^pkg= to ^pkg== matching the actual versions-deb format
  2. autoremove added to version capture -- intermediate build deps were previously lost
  3. +fips suffix stripping for locally-rebuilt FIPS packages

Also good: VERSION_FILE made local to avoid leaking into global scope.

🤖 Posted by DevAce, Jianquan's AI Agent, on his behalf.

@StormLiangMS
Contributor

@liushilongbuaa ms_conflict.

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from c615ecd to d336c37 Compare March 10, 2026 14:02
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from d336c37 to 3f0f93e Compare March 11, 2026 14:01
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS
Contributor

/azpw ms_conflict

@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from 3f0f93e to 5f48384 Compare March 13, 2026 02:57
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

…remove

When ENABLE_VERSION_CONTROL_DEB=y (SONIC_VERSION_CONTROL_COMPONENTS
includes 'deb'), the apt-get hook now actively pins package versions
from versions-deb instead of only warning about missing versions.

Three fixes:
1. Add pin_apt_versions() - rewrites 'apt-get install foo' to
   'apt-get install foo=1.2.3' using versions from versions-deb file.
   Only activates when deb version control is enabled. Packages not
   in the versions file pass through unchanged (with existing warning).
2. Track autoremove - collect package versions before apt-get autoremove
   (like we already do for purge/remove) so intermediate build
   dependencies are captured in purge-versions-deb.
3. Keep existing APT preferences mechanism (Pin-Priority: 999) as
   defense-in-depth alongside the explicit version pinning.

Fixes sonic-net#7502

Signed-off-by: Rustiqly <[email protected]>

Address Copilot review feedback:
- Make VERSION_FILE local in check_apt_version() to avoid global mutation
- Replace grep '^pkg=' with awk -F'==' exact match (versions-deb uses ==)

Signed-off-by: Rustiqly <[email protected]>

When versions-deb is generated from a FIPS build, package versions
may contain a +fips suffix (e.g. openssh-server==1:10.0p1-7+fips).
When building without ENABLE_FIPS=y, these versions don't exist in
the apt repositories, causing 'apt-get install' to fail.

Strip the +fips suffix from pinned versions when ENABLE_FIPS is not
enabled.

Signed-off-by: Rustiqly <[email protected]>

Address review feedback from liushilongbuaa: remove the pin_apt_versions()
function and its call in the apt-get hook. Version pinning is already handled
by /etc/apt/preferences.d/01-versions-deb (Pin-Priority: 999), generated by
update_preference_deb() and installed by pre_run_buildinfo. Having two
mechanisms was confusing and hard to debug.

Also add +fips suffix stripping in update_preference_deb() so preferences.d
handles FIPS packages correctly (previously only pin_apt_versions did this).

Signed-off-by: Rustiqly <[email protected]>
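
For context, the preferences-based pinning that this last commit relies on can be sketched as follows. This is an illustration under assumptions, not the body of update_preference_deb(): the function name emit_preferences is hypothetical, and the stanza shown is the standard APT preferences form (Package / Pin / Pin-Priority), with the +fips suffix stripped as the commit message describes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: turn "name==version" lines into APT preferences stanzas.
# Usage: emit_preferences <versions-file>
emit_preferences() {
    local versions_file=$1 line name version
    while IFS= read -r line; do
        [[ $line == *==* ]] || continue      # skip blank or malformed lines
        name=${line%%==*}                    # text before the first "=="
        version=${line#*==}                  # text after the first "=="
        version=${version%+fips}             # locally rebuilt FIPS suffix
        printf 'Package: %s\nPin: version %s\nPin-Priority: 999\n\n' "$name" "$version"
    done < "$versions_file"
}
```

Redirecting the output to a file under /etc/apt/preferences.d/ is what makes APT prefer the recorded versions. As the PR description notes, a priority of 999 expresses a preference and does not guarantee an exact version match.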
@rustiqly rustiqly force-pushed the fix/reproducible-build-apt-version branch from 5f48384 to d5f2b87 Compare March 13, 2026 14:01
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lihuay lihuay merged commit c40f76d into sonic-net:master Mar 13, 2026
20 checks passed


Development

Successfully merging this pull request may close these issues.

[Reproducible Build] apt-get hook incorrect logic and many deb packages versions are missing in versions file.

9 participants